Re: [R-es] PROBLEMAS NOMBRES DE COLUMNAS CON ESPACIOS CONEXION R-SQL

2019-04-16 Thread Jorge I Velez
Hola Ana,
Lo que ocurre es que “Corteoptimo” es diferente a “Corte optimo”. Lo mismo
ocurre con “Cortediairo”.  Por eso no reconoce que existen.
Saludos,
Jorge.-

El El mar, 16 de abr. de 2019 a las 8:56 a. m., Ana Jimenez Rebollo <
anaji...@gmail.com> escribió:

> Buenas tardes,
>
> Estoy tratando de realizar un update en SQL desde R:
> sqlUpdate(conexion1, data.frame(AUXILIAR), tablename = "AUXILIAR")
>
> y me devuelve el siguiente error:
> *Error in sqlUpdate(conexion1, data.frame(AUXILIAR), tablename =
> "AUXILIAR",  : *
> *  data frame column(s) Corteoptimo Cortediario not in database table*
>
> El problema es que sí existen esas columnas en SQL pero parece que R está
> almacenando internamente los nombres sin espacios ya que si ejecuto la
> función names en R para el data.frame AUXILIAR (que es con el que quiero
> actualizar en R):
> names(AUXILIAR)
>  [1] "Date""Time""Seasson"
>"TimeH"   "Day" "Holiday"
>
>  [7] "Corte optimo""Corte diario"
>
> Y si obtengo los nombres de la tabla SQL:
> sqlColumns(conexion1, "AUXILIAR")
> TABLE_CAT TABLE_SCHEM TABLE_NAME COLUMN_NAME
> OS MKOnline dbo AUXILIAR Date
> OS MKOnline dbo AUXILIAR Time
> OS MKOnline dbo AUXILIAR Seasson
> OS MKOnline dbo AUXILIAR TimeH
> OS MKOnline dbo AUXILIAR Day
> OS MKOnline dbo AUXILIAR Holiday
> OS MKOnline dbo AUXILIAR Corte optimo
> OS MKOnline dbo AUXILIAR Corte diario
>
>
> Los nombres son idénticos por lo que intuyo que el problema viene en el
> nombre almacenado internamente. He probado a renombrar las columnas del
> data.frame sustituyendo los espacios por puntos y por guiones bajos pero me
> devuelve el mismo error.
> PD: El data.frame con el que quiero actualizar en SQL y la tabla llevan el
> mismo nombre, siento si eso lleva a algún tipo de confusión.
> ¿Alguna sugerencia?
>
> Muchas gracias de antemano.
>
> Saludos,
> AJR.
>
> [[alternative HTML version deleted]]
>
> ___
> R-help-es mailing list
> R-help-es@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-help-es
>
-- 
Sent from my phone. Please excuse my brevity and misspelling.

[[alternative HTML version deleted]]

___
R-help-es mailing list
R-help-es@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-help-es


Re: [R-es] PROBLEMAS NOMBRES DE COLUMNAS CON ESPACIOS CONEXION R-SQL

2019-04-16 Thread Javier Marcuzzi
Estimada Ana Rebollo

Intente algo simple, str(datos) y compare los nombres de R con los de sql
en la base de datos, lo ideal es que coincidan, es mucho más simple, la
segunda alternativa es leer como se generan los códigos SQL (INSERT …..),
esto complica un poco el trabajo pero también es posible.

Javier Rubén Marcuzzi

El mar., 16 abr. 2019 a las 11:59, Javier Nieto ()
escribió:

> Hola Ana
>
> Seguramente, como dices, es problema de los nombres internos. Sucede que
> lo habitual y como buena práctica los nombres de los campos no deben llevar
> espacios ni caracteres como acentos o la letra ñ. Mi sugerencia es que
> cambies los nombres con espacios a nombres sin espacios.
>
>
> Saludos
> 
> De: R-help-es  en nombre de Ana Jimenez
> Rebollo 
> Enviado: martes, 16 de abril de 2019 08:55 a. m.
> Para: r-help-es@r-project.org
> Asunto: [R-es] PROBLEMAS NOMBRES DE COLUMNAS CON ESPACIOS CONEXION R-SQL
>
> Buenas tardes,
>
> Estoy tratando de realizar un update en SQL desde R:
> sqlUpdate(conexion1, data.frame(AUXILIAR), tablename = "AUXILIAR")
>
> y me devuelve el siguiente error:
> *Error in sqlUpdate(conexion1, data.frame(AUXILIAR), tablename =
> "AUXILIAR",  : *
> *  data frame column(s) Corteoptimo Cortediario not in database table*
>
> El problema es que sí existen esas columnas en SQL pero parece que R está
> almacenando internamente los nombres sin espacios ya que si ejecuto la
> función names en R para el data.frame AUXILIAR (que es con el que quiero
> actualizar en R):
> names(AUXILIAR)
>  [1] "Date""Time""Seasson"
>"TimeH"   "Day" "Holiday"
>
>  [7] "Corte optimo""Corte diario"
>
> Y si obtengo los nombres de la tabla SQL:
> sqlColumns(conexion1, "AUXILIAR")
> TABLE_CAT TABLE_SCHEM TABLE_NAME COLUMN_NAME
> OS MKOnline dbo AUXILIAR Date
> OS MKOnline dbo AUXILIAR Time
> OS MKOnline dbo AUXILIAR Seasson
> OS MKOnline dbo AUXILIAR TimeH
> OS MKOnline dbo AUXILIAR Day
> OS MKOnline dbo AUXILIAR Holiday
> OS MKOnline dbo AUXILIAR Corte optimo
> OS MKOnline dbo AUXILIAR Corte diario
>
>
> Los nombres son idénticos por lo que intuyo que el problema viene en el
> nombre almacenado internamente. He probado a renombrar las columnas del
> data.frame sustituyendo los espacios por puntos y por guiones bajos pero me
> devuelve el mismo error.
> PD: El data.frame con el que quiero actualizar en SQL y la tabla llevan el
> mismo nombre, siento si eso lleva a algún tipo de confusión.
> ¿Alguna sugerencia?
>
> Muchas gracias de antemano.
>
> Saludos,
> AJR.
>
> [[alternative HTML version deleted]]
>
> ___
> R-help-es mailing list
> R-help-es@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-help-es
>
> [[alternative HTML version deleted]]
>
> ___
> R-help-es mailing list
> R-help-es@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-help-es
>

[[alternative HTML version deleted]]

___
R-help-es mailing list
R-help-es@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-help-es


[R-es] PROBLEMAS NOMBRES DE COLUMNAS CON ESPACIOS CONEXION R-SQL

2019-04-16 Thread Ana Jimenez Rebollo
Buenas tardes,

Estoy tratando de realizar un update en SQL desde R:
sqlUpdate(conexion1, data.frame(AUXILIAR), tablename = "AUXILIAR")

y me devuelve el siguiente error:
*Error in sqlUpdate(conexion1, data.frame(AUXILIAR), tablename =
"AUXILIAR",  : *
*  data frame column(s) Corteoptimo Cortediario not in database table*

El problema es que sí existen esas columnas en SQL pero parece que R está
almacenando internamente los nombres sin espacios ya que si ejecuto la
función names en R para el data.frame AUXILIAR (que es con el que quiero
actualizar en R):
names(AUXILIAR)
 [1] "Date""Time""Seasson"
   "TimeH"   "Day" "Holiday"

 [7] "Corte optimo""Corte diario"

Y si obtengo los nombres de la tabla SQL:
sqlColumns(conexion1, "AUXILIAR")
TABLE_CAT TABLE_SCHEM TABLE_NAME COLUMN_NAME
OS MKOnline dbo AUXILIAR Date
OS MKOnline dbo AUXILIAR Time
OS MKOnline dbo AUXILIAR Seasson
OS MKOnline dbo AUXILIAR TimeH
OS MKOnline dbo AUXILIAR Day
OS MKOnline dbo AUXILIAR Holiday
OS MKOnline dbo AUXILIAR Corte optimo
OS MKOnline dbo AUXILIAR Corte diario


Los nombres son idénticos por lo que intuyo que el problema viene en el
nombre almacenado internamente. He probado a renombrar las columnas del
data.frame sustituyendo los espacios por puntos y por guiones bajos pero me
devuelve el mismo error.
PD: El data.frame con el que quiero actualizar en SQL y la tabla llevan el
mismo nombre, siento si eso lleva a algún tipo de confusión.
¿Alguna sugerencia?

Muchas gracias de antemano.

Saludos,
AJR.

[[alternative HTML version deleted]]

___
R-help-es mailing list
R-help-es@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-help-es


Re: [R] SQL Database

2018-07-26 Thread MacQueen, Don via R-help
Harold,

I don't have much experience with ODBC/RODBC, but given that it's working on 
Win, a driver problem seems plausible.

-Don

--
Don MacQueen
Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062
Lab cell 925-724-7509
 
 

On 7/26/18, 9:37 AM, "Doran, Harold"  wrote:

Thanks for this. I'm using the RODBC stuff now. It works well and is 
currently embedded in a shiny app. So, the entire SQL stuff is transparent to 
the user who simply interacts with the UI. It appears to be working in a local 
windows version. That is, I can successfully open the connection, do my 
sqlQuery, and save those data as objects in the R session. 

But when I run the same code on my dev server (which runs Centos 7), the 
code is breaking and it is seemingly related to the driver. It just cannot open 
the connection. That portion of my code is (with certain things blanked out for 
security):

cn <- odbcDriverConnect(connection="Driver={SQL Server Native Client 11.0};
server=1.1.1.1; 
database=xyz;
uid=*;
pwd=***;"
)

I'm doing my homework now on the right drivers that might be appropriate 
for centos, but if anyone happens to know, hints are appreciated 

Harold


-Original Message-
From: MacQueen, Don [mailto:macque...@llnl.gov] 
Sent: Thursday, July 26, 2018 11:26 AM
To: Doran, Harold ; 'r-help@r-project.org' 

    Subject: Re: [R] SQL Database

From my point of view, the logic is this:

  If the external database is Oracle, use ROracle
  If the external database is MySQL, use RMySQL and similarly for other 
databases

If there is no R package specific to the database, then you drop back to 
RODBC or RJDBC. Hopefully you can get the necessary drivers or java files to 
support the database

Your steps look good (I do them all the time with Oracle and MySQL), and 
realize that you don't have to grab an entire table; you can send SQL queries 
that join tables and subset rows, etc. You can also write results back to the 
database if that's useful.

I prefer to use packages that are based on the DBI package.

-Don

--
Don MacQueen
Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062
Lab cell 925-724-7509
 
 

On 7/25/18, 7:57 AM, "R-help on behalf of Doran, Harold" 
 wrote:

I'm doing some work now to learn which SQL database package is the most 
optimal for the task I am working on. There are many packages, and I'm 
reviewing the documentation on some of them now. I am seeking advice from those 
of you who might suggest a package to use for the task I am currently working 
with.

The work is currently as follows. My users currently use another tool 
to extract tables from a server, save those tables as .csv files, and then 
those csv files are read into R and stuff is done on the data in those files. 
This adds overhead that can be bypassed if users instead can directly access 
the database from within R and grab the tables they need and then those tables 
are data frames in the R session and available to do stuff.

The sequence of work (I think) I just this:

Step 1: Connect to the remote server (connection string and 
authenticate the user)
Step 2: Have a SQL query statement that grabs the tables from the 
remote server 
Step 3: Close the connection

The two packages I have narrowed my studies to are Dbplyr and RODBC, 
both of which seem to be similar. 

Any experiences out there to suggest these two packages are in fact 
right for this task, or would there be other packages that might be more 
optimal for this?

Thanks,
Harold

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL Database

2018-07-26 Thread Doran, Harold
Thanks for this. I'm using the RODBC stuff now. It works well and is currently 
embedded in a shiny app. So, the entire SQL stuff is transparent to the user 
who simply interacts with the UI. It appears to be working in a local windows 
version. That is, I can successfully open the connection, do my sqlQuery, and 
save those data as objects in the R session. 

But when I run the same code on my dev server (which runs Centos 7), the code 
is breaking and it is seemingly related to the driver. It just cannot open the 
connection. That portion of my code is (with certain things blanked out for 
security):

cn <- odbcDriverConnect(connection="Driver={SQL Server Native Client 11.0};
server=1.1.1.1; 
database=xyz;
uid=*;
pwd=***;"
)

I'm doing my homework now on the right drivers that might be appropriate for 
centos, but if anyone happens to know, hints are appreciated 

Harold


-Original Message-
From: MacQueen, Don [mailto:macque...@llnl.gov] 
Sent: Thursday, July 26, 2018 11:26 AM
To: Doran, Harold ; 'r-help@r-project.org' 

Subject: Re: [R] SQL Database

From my point of view, the logic is this:

  If the external database is Oracle, use ROracle
  If the external database is MySQL, use RMySQL and similarly for other 
databases

If there is no R package specific to the database, then you drop back to RODBC 
or RJDBC. Hopefully you can get the necessary drivers or java files to support 
the database

Your steps look good (I do them all the time with Oracle and MySQL), and 
realize that you don't have to grab an entire table; you can send SQL queries 
that join tables and subset rows, etc. You can also write results back to the 
database if that's useful.

I prefer to use packages that are based on the DBI package.

-Don

--
Don MacQueen
Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062
Lab cell 925-724-7509
 
 

On 7/25/18, 7:57 AM, "R-help on behalf of Doran, Harold" 
 wrote:

I'm doing some work now to learn which SQL database package is the most 
optimal for the task I am working on. There are many packages, and I'm 
reviewing the documentation on some of them now. I am seeking advice from those 
of you who might suggest a package to use for the task I am currently working 
with.

The work is currently as follows. My users currently use another tool to 
extract tables from a server, save those tables as .csv files, and then those 
csv files are read into R and stuff is done on the data in those files. This 
adds overhead that can be bypassed if users instead can directly access the 
database from within R and grab the tables they need and then those tables are 
data frames in the R session and available to do stuff.

The sequence of work (I think) I just this:

Step 1: Connect to the remote server (connection string and authenticate 
the user)
Step 2: Have a SQL query statement that grabs the tables from the remote 
server 
Step 3: Close the connection

The two packages I have narrowed my studies to are Dbplyr and RODBC, both 
of which seem to be similar. 

Any experiences out there to suggest these two packages are in fact right 
for this task, or would there be other packages that might be more optimal for 
this?

Thanks,
Harold

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL Database

2018-07-26 Thread MacQueen, Don via R-help
From my point of view, the logic is this:

  If the external database is Oracle, use ROracle
  If the external database is MySQL, use RMySQL
and similarly for other databases

If there is no R package specific to the database, then you drop back to RODBC 
or RJDBC. Hopefully you can get the necessary drivers or java files to support 
the database

Your steps look good (I do them all the time with Oracle and MySQL), and 
realize that you don't have to grab an entire table; you can send SQL queries 
that join tables and subset rows, etc. You can also write results back to the 
database if that's useful.

I prefer to use packages that are based on the DBI package.

-Don

--
Don MacQueen
Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062
Lab cell 925-724-7509
 
 

On 7/25/18, 7:57 AM, "R-help on behalf of Doran, Harold" 
 wrote:

I'm doing some work now to learn which SQL database package is the most 
optimal for the task I am working on. There are many packages, and I'm 
reviewing the documentation on some of them now. I am seeking advice from those 
of you who might suggest a package to use for the task I am currently working 
with.

The work is currently as follows. My users currently use another tool to 
extract tables from a server, save those tables as .csv files, and then those 
csv files are read into R and stuff is done on the data in those files. This 
adds overhead that can be bypassed if users instead can directly access the 
database from within R and grab the tables they need and then those tables are 
data frames in the R session and available to do stuff.

The sequence of work (I think) I just this:

Step 1: Connect to the remote server (connection string and authenticate 
the user)
Step 2: Have a SQL query statement that grabs the tables from the remote 
server 
Step 3: Close the connection

The two packages I have narrowed my studies to are Dbplyr and RODBC, both 
of which seem to be similar. 

Any experiences out there to suggest these two packages are in fact right 
for this task, or would there be other packages that might be more optimal for 
this?

Thanks,
Harold

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL Database

2018-07-26 Thread Martin Maechler
> Doran, Harold 
> on Wed, 25 Jul 2018 14:57:13 + writes:

> I'm doing some work now to learn which SQL database
> package is the most optimal for the task I am working on.

Hmm... we would have a problem with optimize() and optim() if
this was

   optimal << more optimal << most optimal

:-)  ;-)

Best,
Martin

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL Database

2018-07-25 Thread Bert Gunter
https://rviews.rstudio.com/2017/10/18/database-queries-with-r/

Cheers,
Bert



Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )

On Wed, Jul 25, 2018 at 7:57 AM, Doran, Harold  wrote:

> I'm doing some work now to learn which SQL database package is the most
> optimal for the task I am working on. There are many packages, and I'm
> reviewing the documentation on some of them now. I am seeking advice from
> those of you who might suggest a package to use for the task I am currently
> working with.
>
> The work is currently as follows. My users currently use another tool to
> extract tables from a server, save those tables as .csv files, and then
> those csv files are read into R and stuff is done on the data in those
> files. This adds overhead that can be bypassed if users instead can
> directly access the database from within R and grab the tables they need
> and then those tables are data frames in the R session and available to do
> stuff.
>
> The sequence of work (I think) I just this:
>
> Step 1: Connect to the remote server (connection string and authenticate
> the user)
> Step 2: Have a SQL query statement that grabs the tables from the remote
> server
> Step 3: Close the connection
>
> The two packages I have narrowed my studies to are Dbplyr and RODBC, both
> of which seem to be similar.
>
> Any experiences out there to suggest these two packages are in fact right
> for this task, or would there be other packages that might be more optimal
> for this?
>
> Thanks,
> Harold
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/
> posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] SQL Database

2018-07-25 Thread Doran, Harold
I'm doing some work now to learn which SQL database package is the most optimal 
for the task I am working on. There are many packages, and I'm reviewing the 
documentation on some of them now. I am seeking advice from those of you who 
might suggest a package to use for the task I am currently working with.

The work is currently as follows. My users currently use another tool to 
extract tables from a server, save those tables as .csv files, and then those 
csv files are read into R and stuff is done on the data in those files. This 
adds overhead that can be bypassed if users instead can directly access the 
database from within R and grab the tables they need and then those tables are 
data frames in the R session and available to do stuff.

The sequence of work (I think) I just this:

Step 1: Connect to the remote server (connection string and authenticate the 
user)
Step 2: Have a SQL query statement that grabs the tables from the remote server 
Step 3: Close the connection

The two packages I have narrowed my studies to are Dbplyr and RODBC, both of 
which seem to be similar. 

Any experiences out there to suggest these two packages are in fact right for 
this task, or would there be other packages that might be more optimal for this?

Thanks,
Harold

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL Server "float" is not handled by RODBC -- Is there a workaround?

2015-10-15 Thread Mark Dalphin
Hi Jim,

No answers over the course of 24 hours so I'll give it a shot.

First, I always work under Linux, so my answers may well be worthless
for your Windows scenario.

Second, I don't know if my workaround works as I don't actually have a
SQL Server DB using float.

Now the workaround:

I have had many problems in the past using ODBC to connect to databases.
Nothing I could nail down to a fault in that system, but just no end of
problems. Some of that, of course, is due to me generally working under
Linux.

My general workaround that has been clean is to use JDBC instead. There
have been hassles at times to set up the RJava, but recent versions of
that have installed very easily. Once RJava is in place (and under
Windows, you'll have fun setting up Java cleanly), then installation of
a JDBC jar (I use jtds from SourceForge for SQL Server) and finally
RJDBC. The generic nature of the JDBC interface is a joy to work with,
interacting with most database types very well and in a uniform manner.

So, lots of work getting JDBC up and going to see if an alternative path
into your DB gets you your data in a better format. Now you see why I
waited 24 hours to say anything at all ...

Also, it might be worth while posting on the DB specific maillist:
https://stat.ethz.ch/mailman/listinfo/r-sig-db

Hope this helps,
Mark Dalphin


On 15/10/15 07:23, jim holtman wrote:
> Here is the system I am using:
> =
>> sessionInfo()
> R version 3.2.2 (2015-08-14)
> Platform: x86_64-w64-mingw32/x64 (64-bit)
> Running under: Windows 7 x64 (build 7601) Service Pack 1
>
> locale:
> [1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United
> States.1252
> [3] LC_MONETARY=English_United States.1252
> LC_NUMERIC=C
> [5] LC_TIME=English_United States.1252
>
> attached base packages:
> [1] stats graphics  grDevices utils datasets  methods   base
>
> other attached packages:
> [1] lubridate_1.3.3 RODBC_1.3-12
> loaded via a namespace (and not attached):
> [1] magrittr_1.5  plyr_1.8.3tools_3.2.2   memoise_0.2.1 Rcpp_0.12.1
> stringi_0.5-5 digest_0.6.8
> [8] stringr_1.0.0
> 
>
> I have data on a SQL Server that I am connecting to where some of the
> fields are defined as "float" so that the data is stored in the database as
> an IEEE 754 value.  Now when I read this is using RODBC, the data comes
> across the interface in the floating point format; I used Wireshark to
> examine the packets that were being sent.  Some of the data is also defined
> as "int" and comes across in binary.
>
> When the data is read in with
>
> df <- sqlQuery(db, "select * from mydb", as.is = TRUE)
>
> The resulting dataframe has the floating point values as 'chr' and the
> integer fields as 'int'; I would have expected the floating point fields to
> be 'num'.  Now in the "ODBC Connectivity" Vignette by Ripley there was the
> comment that "double" data values come back as type 8, but on some systems
> they may be type 6; well on SQL Server, "float" is type 6.
>
> So what appears to happen, is this data is not recognized as a floating
> point value and is therefore converted to a character.  When the data is
> made available to the R script, I then have to convert this back to
> floating point.  If I use "stringsAsFactors = FALSE" on the query, this
> conversion back to floating point will be done within the RODBC package.
> This becomes a problem when I have dataframes with several million rows and
> multiple columns of numerics is that the conversion to/from characters is
> adding time to the processing.
>
> So I was wondering is there a workaround to this problem?  Is it possible
> to add the capability to RODBC when processing SQL Server to avoid this
> conversion?  Or is there some other way around this problem?
>
> Jim Holtman
> Data Munger Guru
>
> What is the problem that you are trying to solve?
> Tell me what you want to do, not how you want to do it.
>
>   [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL Server "float" is not handled by RODBC -- Is there a workaround?

2015-10-15 Thread jim holtman
Mark,

Thanks for the suggestion.  I will have to look into that option.  I assume
that if I am running on a 64-bit system, I also have to use the 64-bit
version of Java.  We have had some problems in the past because the company
standard is a 32-bit version of Java and we had to also load in the 64-bit
version to work with the XLConnect package.

I was reading the RJDBC package documention and they seem to list a lot of
methods (e.g., 'dbReadTable'), but don't say what the parameters are, or
what it returns.  Where do you find this information?  Also I notice that I
probably have to get the JDBC driver for SQL Server from Microsoft and
install that - is that correct?

I hate to start mixing approaches since we have a large number of scripts
that currently use the RODBC package, but I will try to see if the approach
you proposed do help overcome this problem.


Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.

On Thu, Oct 15, 2015 at 4:18 PM, Mark Dalphin  wrote:

> Hi Jim,
>
> No answers over the course of 24 hours so I'll give it a shot.
>
> First, I always work under Linux, so my answers may well be worthless
> for your Windows scenario.
>
> Second, I don't know if my workaround works as I don't actually have a
> SQL Server DB using float.
>
> Now the workaround:
>
> I have had many problems in the past using ODBC to connect to databases.
> Nothing I could nail down to a fault in that system, but just no end of
> problems. Some of that, of course, is due to me generally working under
> Linux.
>
> My general workaround that has been clean is to use JDBC instead. There
> have been hassles at times to set up the RJava, but recent versions of
> that have installed very easily. Once RJava is in place (and under
> Windows, you'll have fun setting up Java cleanly), then installation of
> a JDBC jar (I use jtds from SourceForge for SQL Server) and finally
> RJDBC. The generic nature of the JDBC interface is a joy to work with,
> interacting with most database types very well and in a uniform manner.
>
> So, lots of work getting JDBC up and going to see if an alternative path
> into your DB gets you your data in a better format. Now you see why I
> waited 24 hours to say anything at all ...
>
> Also, it might be worth while posting on the DB specific maillist:
> https://stat.ethz.ch/mailman/listinfo/r-sig-db
>
> Hope this helps,
> Mark Dalphin
>
>
> On 15/10/15 07:23, jim holtman wrote:
> > Here is the system I am using:
> > =
> >> sessionInfo()
> > R version 3.2.2 (2015-08-14)
> > Platform: x86_64-w64-mingw32/x64 (64-bit)
> > Running under: Windows 7 x64 (build 7601) Service Pack 1
> >
> > locale:
> > [1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United
> > States.1252
> > [3] LC_MONETARY=English_United States.1252
> > LC_NUMERIC=C
> > [5] LC_TIME=English_United States.1252
> >
> > attached base packages:
> > [1] stats graphics  grDevices utils datasets  methods   base
> >
> > other attached packages:
> > [1] lubridate_1.3.3 RODBC_1.3-12
> > loaded via a namespace (and not attached):
> > [1] magrittr_1.5  plyr_1.8.3tools_3.2.2   memoise_0.2.1 Rcpp_0.12.1
> > stringi_0.5-5 digest_0.6.8
> > [8] stringr_1.0.0
> > 
> >
> > I have data on a SQL Server that I am connecting to where some of the
> > fields are defined as "float" so that the data is stored in the database
> as
> > an IEEE 754 value.  Now when I read this is using RODBC, the data comes
> > across the interface in the floating point format; I used Wireshark to
> > examine the packets that were being sent.  Some of the data is also
> defined
> > as "int" and comes across in binary.
> >
> > When the data is read in with
> >
> > df <- sqlQuery(db, "select * from mydb", as.is = TRUE)
> >
> > The resulting dataframe has the floating point values as 'chr' and the
> > integer fields as 'int'; I would have expected the floating point fields
> to
> > be 'num'.  Now in the "ODBC Connectivity" Vignette by Ripley there was
> the
> > comment that "double" data values come back as type 8, but on some
> systems
> > they may be type 6; well on SQL Server, "float" is type 6.
> >
> > So what appears to happen, is this data is not recognized as a floating
> > point value and is therefore converted to a character.  When the data is
> > made available to the R script, I then have to convert this back to
> > floating point.  If I use "stringsAsFactors = FALSE" on the query, this
> > conversion back to floating point will be done within the RODBC package.
> > This becomes a problem when I have dataframes with several million rows
> and
> > multiple columns of numerics is that the conversion to/from characters is
> > adding time to the processing.
> >
> > So I was wondering is there a workaround to this problem?  Is it possible
> > to add the capability to RODBC when 

Re: [R] SQL Server "float" is not handled by RODBC -- Is there a workaround?

2015-10-15 Thread Mark Dalphin
Hi Jim,

Yes, your Java versions need to match bit width.

As for drivers, I do not use the Microsoft one; it used to be hard to
obtain for Linux users; I don't know if it still is. I currently use
"jtds" from Sourceforge:
http://jtds.sourceforge.net/

>From their site:
> jTDS is an open source 100% pure Java (type 4) JDBC 3.0 driver for
> Microsoft SQL Server (6.5, 7, 2000, 2005, 2008 and 2012) and Sybase
> Adaptive Server Enterprise (10, 11, 12 and 15).

I always found reading the DBI + DBI-specific driver (eg RPostgreSQL or
RJDBC) documentation challenging. I think it is oriented towards the
programmer who wrote it rather than the end user. It probably needs a
long, wordy vignette like Tim Bunce once wrote for the Perl DBI, but I
am not stepping up to write it (so you don't hear me complaining)! Over
the years, I've gleaned enough and experimented enough that I have
certain methods well worked out. And I have ignored much of the
functionality that has been added over the years; I am sure that the DBI
implementors have put some real gems in there as I stumble upon new ones
all the time.

The help for DBI, "help(package='DBI')", lists many methods. RJDBC is
just an intermediary and it happens to implement most of them. In fact,
I usually read the DBI drivers packages (RPosgreSQL, RJDBC) with an eye
towards find "what is not implemented or what is broken" more than
anything else.

DBIConnection-classDBIConnection class.
DBIDriver-classDBIDriver class.
DBIObject-classDBIObject class.
DBIResult-classDBIResult class.
SQLSQL quoting.
dbClearResult  Clear a result set.
dbColumnInfo   Information about result types.
dbConnect  Create a connection to a DBMS.
dbDataType Determine the SQL data type of an
object.
dbDisconnect   Disconnect (close) a connection
dbDriver   Load and unload database drivers.
dbExistsTable  Does a table exist?
dbFetchFetch records from a previously
executed query.
dbGetException Get DBMS exceptions.
dbGetInfo  Get DBMS metadata.
dbGetQuery Send query, retrieve results and
then clear result set.
dbGetRowCount  The number of rows fetched so far.
dbGetRowsAffected  The number of rows affected by
data modifying query.
dbGetStatement Get the statement associated with
a result set
dbHasCompleted Has the operation completed?
dbIsValid  Is this DBMS object still valid?
dbListConnections  List currently open connections.
dbListFields   List field names of a remote table.
dbListResults  A list of all pending results.
dbListTables   List remote tables.
dbReadTableCopy data frames to and from
database tables.
dbRemoveTable  Remove a table from the database.
dbSendQueryExecute a statement on a given
database connection.
dbiCheckCompliance Check a driver for compliance
with DBI.
make.db.names  Make R identifiers into legal SQL
identifiers.
transactions   Begin/commit/rollback SQL
transactions

In short, I usually only use a small subset of what is available:
dbConnect, dbDisconnect, dbGetQuery, dbSendQuery, dbFetch and
'transactions'

Some of the apparent richness is from things like two methods to get
table information, one detailed and one simple. And since I know my
schemas already, I seldom use either. In addition, there is ample
support for complex DB interaction, including full support for
transactions, allowing multi-table updates with security.

My usual operation is something like this:

library(RJDBC)

jdbcStr <- 'jdbc:jtds:sqlserver://myHost.myLan:myPort/myDbName'

drv <- JDBC("net.sourceforge.jtds.jdbc.Driver",
"/my/long/path/to/my/JDBC_Drivers/jtds/jtds-1.3.0.jar")

conn <- dbConnect(drv, jdbcStr, user='mark', password='*')

## See what tables are present (DB exploration; not used in production)
tab <- dbGetTables(conn)

## A typical query if I am to iterate through
rs <- dbSendQuery(conn, 'SELECT TOP 10 * FROM Orders;')
## followed by dbFetch calls

## A more typical query if I can grab the whole thing at once
rs <- dbGetQuery(conn, 'SELECT * FROM Orders;')
print(rs)

dbDisconnect(conn)

Hope this helps; I might be able to help with specific examples for some
of the DBI methods, if you need it. I suspect that what I have written
above will cover most "query SQL", though not the 

[R] SQL Server "float" is not handled by RODBC -- Is there a workaround?

2015-10-14 Thread jim holtman
Here is the system I am using:
=
> sessionInfo()
R version 3.2.2 (2015-08-14)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United
States.1252
[3] LC_MONETARY=English_United States.1252
LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

other attached packages:
[1] lubridate_1.3.3 RODBC_1.3-12
loaded via a namespace (and not attached):
[1] magrittr_1.5  plyr_1.8.3tools_3.2.2   memoise_0.2.1 Rcpp_0.12.1
stringi_0.5-5 digest_0.6.8
[8] stringr_1.0.0


I have data on a SQL Server that I am connecting to where some of the
fields are defined as "float" so that the data is stored in the database as
an IEEE 754 value.  Now when I read this is using RODBC, the data comes
across the interface in the floating point format; I used Wireshark to
examine the packets that were being sent.  Some of the data is also defined
as "int" and comes across in binary.

When the data is read in with

df <- sqlQuery(db, "select * from mydb", as.is = TRUE)

The resulting dataframe has the floating point values as 'chr' and the
integer fields as 'int'; I would have expected the floating point fields to
be 'num'.  Now in the "ODBC Connectivity" Vignette by Ripley there was the
comment that "double" data values come back as type 8, but on some systems
they may be type 6; well on SQL Server, "float" is type 6.

So what appears to happen, is this data is not recognized as a floating
point value and is therefore converted to a character.  When the data is
made available to the R script, I then have to convert this back to
floating point.  If I use "stringsAsFactors = FALSE" on the query, this
conversion back to floating point will be done within the RODBC package.
This becomes a problem when I have dataframes with several million rows and
multiple columns of numerics is that the conversion to/from characters is
adding time to the processing.

So I was wondering is there a workaround to this problem?  Is it possible
to add the capability to RODBC when processing SQL Server to avoid this
conversion?  Or is there some other way around this problem?

Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] SQL Requests Templating

2014-10-29 Thread Alexander Myltsev
Hi,

I am new to R.

What I'd like to know is how to empower sqldf with templates like
https://www.playframework.com/documentation/2.3.x/ScalaAnorm does? What
does seasoned R-hacker use for this purpose: dedicated R-package that I am
not aware of, or kind of format string, or something else?

A.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL vs R

2014-05-21 Thread Dr Eberhard Lisse
So,

some feedback.

Have installed MariaDB 10.0.10 on the Linux box. That speeded things up.
Changed from InnoDB/XtraDb to Aria. That speeded loading of the data up.
Have installed MariaDB on the iMac. That speeded things up more.
Tried to tune MariadDB's config. Didn't speed things up much, but for
the query buffer.
Figured out Replication (from the linux box to the iMac). This slowed
loading down somewhat.
Played with the SQL. Speeded things up significantly.
Played with the inxdexes. Did not speed things up much.
Found what I could do in data.table that was faster than SQL and did
that. Obvious increase in speed.

My R processing time came down from 35 to 6 1/2 minutes.


Removed all large tables before saving (and once the raw data was no
longer required). That reduced RData from 150MB to 7KB.

Pushed the table and image generation into a second R file. This takes 4
seconds. The corresponding LyX/LaTeX/Beamer/KnitR runs in 12 seconds.

Installed RStudio. Nice.

Adding new SQL queries adds between 30 and 90 seconds in the input R
file, next to nothing to the presentation generation.

I could not care lass how long the input takes, even hours, as long as I
can save the analysis results and not the data into the RData.

el

PS: Ordered a MacPro :-)-O. Will report back.

on 2014-05-06, 15:40 Peter Crowther said the following:
 The dataset is not large by database standards.  Even in mySQL - not known
 for its speed at multi-row querying - the queries you describe should
 complete within a few seconds on even moderately recent hardware if your
 indexes are reasonable.
 
 What are your performance criteria for processing these queries, and how
 have you / your team optimised the relational database storage?
 
 Cheers,
 
 - Peter
 --
 Peter Crowther, Director, Melandra Limited
 
 
 On 6 May 2014 15:32, Dr Eberhard Lisse e...@lisse.na wrote:
 
 Exactly,
 
 which is why I am looking for something faster :-)-O
 
 el
 
 on 2014-05-06, 15:21 David R Forrest said the following:
 It sounds as if your underlying MySQL database is too slow for your
 purposes.  Whatever you layer on top of it will be constrained by
 the underlying database.  To speed up the process significantly,
 you may need to do work on the database backend part of the
 process.

 Dave

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

 
   [[alternative HTML version deleted]]


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL vs R

2014-05-06 Thread Eberhard Lisse
Thank you.

My requirements are that simple. One table, 11 fields, of which 3 are
interesting, 30 Million records, growing daily by between 30.

And, yes I have spent an enormous amount of time reading these things,
but for someone not dealing with this professionally and/or on a daily
basis, the documents don't help much.

el


on 2014-05-04, 05:26 Jeff Newmiller said the following:
 ?table
 ?aggregate
 
 Also, packages plyr, data.table, and dplyr.  You might consider
 reading [1], but if your interests are really as simple as your
 examples then the table function should be sufficient.  That
 function is discussed in the Introduction to R document that you
 really should have read before posting here.
 
 [1] http://www.jstatsoft.org/v40/i01/
[...]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL vs R

2014-05-06 Thread Jeff Newmiller
In what format is this growing data stored? CSV? SQL? Log textfile? You say 
you don't want to use sqldf, but you haven't said what you do want to use.
---
Jeff NewmillerThe .   .  Go Live...
DCN:jdnew...@dcn.davis.ca.usBasics: ##.#.   ##.#.  Live Go...
  Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
/Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
--- 
Sent from my phone. Please excuse my brevity.

On May 6, 2014 1:16:12 AM PDT, Eberhard Lisse nos...@lisse.na wrote:
Thank you.

My requirements are that simple. One table, 11 fields, of which 3 are
interesting, 30 Million records, growing daily by between 30.

And, yes I have spent an enormous amount of time reading these things,
but for someone not dealing with this professionally and/or on a daily
basis, the documents don't help much.

el


on 2014-05-04, 05:26 Jeff Newmiller said the following:
 ?table
 ?aggregate
 
 Also, packages plyr, data.table, and dplyr.  You might consider
 reading [1], but if your interests are really as simple as your
 examples then the table function should be sufficient.  That
 function is discussed in the Introduction to R document that you
 really should have read before posting here.
 
 [1] http://www.jstatsoft.org/v40/i01/
[...]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL vs R

2014-05-06 Thread Dr Eberhard Lisse
Jeff

It's in MySQL, at the moment roughly 1.8 GB, if I pull it into a
dataframe it saves to 180MB. I work from the dataframe.

But, it's not only a size issue it's also a speed issue and hence I
don't care what I am going to use, as long as it is fast.

sqldf is easy to understand for me but it takes ages.  If
alternatives were roughly similar in speed I would remain with
sqldf.

dplyr sounds faster, and promising, but the intrinsic stuff is
way beyond me (elderly Gynaecologist) on the learning curve...

el

on 2014-05-06, 09:41 Jeff Newmiller said the following:
 In what format is this growing data stored?  CSV? SQL? Log
 textfile?  You say you don't want to use sqldf, but you haven't
 said what you do want to use.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL vs R

2014-05-06 Thread Carlos Ortega
Hi,

Yes dplyr syntax is quite equivalent to SQL, although it is faster.
Another alternative you could consider is to use *data.table* which has a
syntax very similar to the way you select subset within a data.frame and in
terms of performance is faster (a bit) than sqldf.

You can get some idea of how to work with it here:

http://stackoverflow.com/questions/1727772/quickly-reading-very-large-tables-as-dataframes-in-r

Regards,
Carlos Ortega
www.qualityexcellence.es





2014-05-06 11:12 GMT+02:00 Dr Eberhard Lisse e...@lisse.na:

 Jeff

 It's in MySQL, at the moment roughly 1.8 GB, if I pull it into a
 dataframe it saves to 180MB. I work from the dataframe.

 But, it's not only a size issue it's also a speed issue and hence I
 don't care what I am going to use, as long as it is fast.

 sqldf is easy to understand for me but it takes ages.  If
 alternatives were roughly similar in speed I would remain with
 sqldf.

 dplyr sounds faster, and promising, but the intrinsic stuff is
 way beyond me (elderly Gynaecologist) on the learning curve...

 el

 on 2014-05-06, 09:41 Jeff Newmiller said the following:
  In what format is this growing data stored?  CSV? SQL? Log
  textfile?  You say you don't want to use sqldf, but you haven't
  said what you do want to use.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Saludos,
Carlos Ortega
www.qualityexcellence.es

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL vs R

2014-05-06 Thread David McPearson
On Tue, 6 May 2014 10:12:50 +0100 Dr Eberhard Lisse e...@lisse.na wrote

 Jeff
 
 It's in MySQL, at the moment roughly 1.8 GB, if I pull it into a
 dataframe it saves to 180MB. I work from the dataframe.
 
 But, it's not only a size issue it's also a speed issue and hence I
 don't care what I am going to use, as long as it is fast.
 
 sqldf is easy to understand for me but it takes ages.  If
 alternatives were roughly similar in speed I would remain with
 sqldf.
 
 dplyr sounds faster, and promising, but the intrinsic stuff is
 way beyond me (elderly Gynaecologist) on the learning curve...
 
 el
 
 on 2014-05-06, 09:41 Jeff Newmiller said the following:
  In what format is this growing data stored?  CSV? SQL? Log
  textfile?  You say you don't want to use sqldf, but you haven't
  said what you do want to use.
 


It seems like you are trying to extract a (relatively) small data set from a
much larger SQL databaseWhy not do the SQL stiff in the database and the
analysis *statsm graphics...) in R? Maybe use a make table query to grab the
data of interest, and then import the whole table into R for the analysis?
(Disclaimer: my ignorance of SQL is not far off total)

HTH
D.


South Africas premier free email service - www.webmail.co.za 

Cheapest Insurance Quotes!
https://www.outsurance.co.za/insurance-quote/personal/?source=msncr=Postit14_468x60_gifcid=322

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL vs R

2014-05-06 Thread Dr Eberhard Lisse
David,

this is quite slow :-)-O

el

on 2014-05-06, 10:55 David McPearson said the following:
[...]
 It seems like you are trying to extract a (relatively) small data set from a
 much larger SQL databaseWhy not do the SQL stiff in the database and the
 analysis *statsm graphics...) in R? Maybe use a make table query to grab the
 data of interest, and then import the whole table into R for the analysis?
 (Disclaimer: my ignorance of SQL is not far off total)
 
 HTH
 D.
[...]
-- 
Dr. Eberhard W. Lisse  \/ Obstetrician  Gynaecologist (Saar)
e...@lisse.na/ * |   Telephone: +264 81 124 6733 (cell)
PO Box 8421 \ /
Bachbrecht, Namibia ;/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL vs R

2014-05-06 Thread Gabor Grothendieck
On Tue, May 6, 2014 at 5:12 AM, Dr Eberhard Lisse e...@lisse.na wrote:
 Jeff

 It's in MySQL, at the moment roughly 1.8 GB, if I pull it into a
 dataframe it saves to 180MB. I work from the dataframe.

 But, it's not only a size issue it's also a speed issue and hence I
 don't care what I am going to use, as long as it is fast.

 sqldf is easy to understand for me but it takes ages.  If
 alternatives were roughly similar in speed I would remain with
 sqldf.

 dplyr sounds faster, and promising, but the intrinsic stuff is
 way beyond me (elderly Gynaecologist) on the learning curve...

You can create indices in sqldf and that can speed up processing
substantially for certain operations.   See examples 4h and 4i on the
sqldf home page: http://sqldf.googlecode.com. Also note that sqldf
supports not only the default SQLite backend but also MySQL, h2 and
postgresql.  See ?sqldf for info on using sqldf with MySQL and the
others.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL vs R

2014-05-06 Thread Dr Eberhard Lisse
Thanks,

tried all of that, too slow.

el

on 2014-05-06, 12:00 Gabor Grothendieck said the following:
 On Tue, May 6, 2014 at 5:12 AM, Dr Eberhard Lisse e...@lisse.na wrote:
 Jeff

 It's in MySQL, at the moment roughly 1.8 GB, if I pull it into a
 dataframe it saves to 180MB. I work from the dataframe.

 But, it's not only a size issue it's also a speed issue and hence
 I don't care what I am going to use, as long as it is fast.

 sqldf is easy to understand for me but it takes ages.  If
 alternatives were roughly similar in speed I would remain with
 sqldf.

 dplyr sounds faster, and promising, but the intrinsic stuff is
 way beyond me (elderly Gynaecologist) on the learning curve...
 
 You can create indices in sqldf and that can speed up processing
 substantially for certain operations.  See examples 4h and 4i on
 the sqldf home page: http://sqldf.googlecode.com.  Also note that
 sqldf supports not only the default SQLite backend but also MySQL,
 h2 and postgresql.  See ?sqldf for info on using sqldf with MySQL
 and the others.
 

-- 
Dr. Eberhard W. Lisse  \/ Obstetrician  Gynaecologist (Saar)
e...@lisse.na/ * |   Telephone: +264 81 124 6733 (cell)
PO Box 8421 \ /
Bachbrecht, Namibia ;/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL vs R

2014-05-06 Thread David R Forrest
It sounds as if your underlying MySQL database is too slow for your purposes.  
Whatever you layer on top of it will be constrained by the underlying database. 
 To speed up the process significantly, you may need to do work on the database 
backend part of the process.

Dave


On May 6, 2014, at 7:08 AM, Dr Eberhard Lisse e...@lisse.na wrote:

 Thanks,
 
 tried all of that, too slow.
 
 el
 
 on 2014-05-06, 12:00 Gabor Grothendieck said the following:
 On Tue, May 6, 2014 at 5:12 AM, Dr Eberhard Lisse e...@lisse.na wrote:
 Jeff
 
 It's in MySQL, at the moment roughly 1.8 GB, if I pull it into a
 dataframe it saves to 180MB. I work from the dataframe.
 
 But, it's not only a size issue it's also a speed issue and hence
 I don't care what I am going to use, as long as it is fast.
 
 sqldf is easy to understand for me but it takes ages.  If
 alternatives were roughly similar in speed I would remain with
 sqldf.
 
 dplyr sounds faster, and promising, but the intrinsic stuff is
 way beyond me (elderly Gynaecologist) on the learning curve...
 
 You can create indices in sqldf and that can speed up processing
 substantially for certain operations.  See examples 4h and 4i on
 the sqldf home page: http://sqldf.googlecode.com.  Also note that
 sqldf supports not only the default SQLite backend but also MySQL,
 h2 and postgresql.  See ?sqldf for info on using sqldf with MySQL
 and the others.
 
 
 -- 
 Dr. Eberhard W. Lisse  \/ Obstetrician  Gynaecologist (Saar)
 e...@lisse.na/ * |   Telephone: +264 81 124 6733 (cell)
 PO Box 8421 \ /
 Bachbrecht, Namibia ;/
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

--
Dr. David Forrest
d...@vims.edu
804-684-7900w
757-968-5509h
804-413-7125c
#240 Andrews Hall
Virginia Institute of Marine Science
Route 1208, Greate Road
PO Box 1346
Gloucester Point, VA, 23062-1346












signature.asc
Description: Message signed with OpenPGP using GPGMail
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL vs R

2014-05-06 Thread Dr Eberhard Lisse
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Exactly,

which is why I am looking for something faster :-)-O

el

on 2014-05-06, 15:21 David R Forrest said the following:
 It sounds as if your underlying MySQL database is too slow for your
 purposes.  Whatever you layer on top of it will be constrained by
 the underlying database.  To speed up the process significantly,
 you may need to do work on the database backend part of the
 process.
 
 Dave
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.12 (Darwin)
Comment: GPGTools - http://gpgtools.org
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQCVAwUBU2jyd1sF2hmmSQy5AQJVPQP+MnrEkXLY9PK+N2CB+maySkRKhEXcWTUA
KNOQnTDaYl3wnRZKg8y1wiZbLFA8tWsKpXPv91phDZ2000MTbv7SbnpBXthSzbAn
clEOniQqRcXci1Q2Qjd+mH0YxyA6XpNvBnBIlbxPsQbObwjK+dKl7/cna1oZKUhW
6aytsFtPZTI=
=zepY
-END PGP SIGNATURE-

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL vs R

2014-05-06 Thread Peter Crowther
The dataset is not large by database standards.  Even in mySQL - not known
for its speed at multi-row querying - the queries you describe should
complete within a few seconds on even moderately recent hardware if your
indexes are reasonable.

What are your performance criteria for processing these queries, and how
have you / your team optimised the relational database storage?

Cheers,

- Peter
--
Peter Crowther, Director, Melandra Limited


On 6 May 2014 15:32, Dr Eberhard Lisse e...@lisse.na wrote:

 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA1

 Exactly,

 which is why I am looking for something faster :-)-O

 el

 on 2014-05-06, 15:21 David R Forrest said the following:
  It sounds as if your underlying MySQL database is too slow for your
  purposes.  Whatever you layer on top of it will be constrained by
  the underlying database.  To speed up the process significantly,
  you may need to do work on the database backend part of the
  process.
 
  Dave
 -BEGIN PGP SIGNATURE-
 Version: GnuPG v1.4.12 (Darwin)
 Comment: GPGTools - http://gpgtools.org
 Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

 iQCVAwUBU2jyd1sF2hmmSQy5AQJVPQP+MnrEkXLY9PK+N2CB+maySkRKhEXcWTUA
 KNOQnTDaYl3wnRZKg8y1wiZbLFA8tWsKpXPv91phDZ2000MTbv7SbnpBXthSzbAn
 clEOniQqRcXci1Q2Qjd+mH0YxyA6XpNvBnBIlbxPsQbObwjK+dKl7/cna1oZKUhW
 6aytsFtPZTI=
 =zepY
 -END PGP SIGNATURE-

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL vs R

2014-05-06 Thread Bert Gunter
I believe this discussion should be taken offlist as it no longer
seems to be concerned with R.

-- Bert Gunter

Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

Data is not information. Information is not knowledge. And knowledge
is certainly not wisdom.
H. Gilbert Welch




On Tue, May 6, 2014 at 7:40 AM, Peter Crowther
peter.crowt...@melandra.com wrote:
 The dataset is not large by database standards.  Even in mySQL - not known
 for its speed at multi-row querying - the queries you describe should
 complete within a few seconds on even moderately recent hardware if your
 indexes are reasonable.

 What are your performance criteria for processing these queries, and how
 have you / your team optimised the relational database storage?

 Cheers,

 - Peter
 --
 Peter Crowther, Director, Melandra Limited


 On 6 May 2014 15:32, Dr Eberhard Lisse e...@lisse.na wrote:

 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA1

 Exactly,

 which is why I am looking for something faster :-)-O

 el

 on 2014-05-06, 15:21 David R Forrest said the following:
  It sounds as if your underlying MySQL database is too slow for your
  purposes.  Whatever you layer on top of it will be constrained by
  the underlying database.  To speed up the process significantly,
  you may need to do work on the database backend part of the
  process.
 
  Dave
 -BEGIN PGP SIGNATURE-
 Version: GnuPG v1.4.12 (Darwin)
 Comment: GPGTools - http://gpgtools.org
 Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

 iQCVAwUBU2jyd1sF2hmmSQy5AQJVPQP+MnrEkXLY9PK+N2CB+maySkRKhEXcWTUA
 KNOQnTDaYl3wnRZKg8y1wiZbLFA8tWsKpXPv91phDZ2000MTbv7SbnpBXthSzbAn
 clEOniQqRcXci1Q2Qjd+mH0YxyA6XpNvBnBIlbxPsQbObwjK+dKl7/cna1oZKUhW
 6aytsFtPZTI=
 =zepY
 -END PGP SIGNATURE-

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


 [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL vs R

2014-05-06 Thread Thomas Lumley
On Wed, May 7, 2014 at 2:21 AM, David R Forrest d...@vims.edu wrote:
 It sounds as if your underlying MySQL database is too slow for your purposes. 
  Whatever you layer on top of it will be constrained by the underlying 
 database.  To speed up the process significantly, you may need to do work on 
 the database backend part of the process.


You might try MonetDB and its R interface -- it is fast for
aggregation operations, and either the current version or the upcoming
version has dplyr support.

-thomas

-- 
Thomas Lumley
Professor of Biostatistics
University of Auckland

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL vs R

2014-05-05 Thread David Winsemius

On May 5, 2014, at 11:44 AM, Dr Eberhard Lisse wrote:

 I do not wish to prolong this metadiscussion but I remain confused
 by your advice:
 
 1) You don't understand what I asked (ie would have to parse two
 simple SQL statements)

Correct ... at least for me. I could have guessed at what that statement might 
have meant, but why should I need to guess? Why not use a shared 
naturallanguage rather than restricting your audience to the more limited group 
that understands both languages?

 2) The Original Post is understood enough, however, to point me to
 the Introduction to R (where I have not found something to help
 me)

That means my guess would have been wrong, since like Jeff Newmiller, I thought 
a simple call to `table` would have succeeded. (Section 5.10 although the 
desire for ordering suggested by my guess regarding 

 
 3) My name is not Pete.

I'm actually not sure who Pete was. It's a local expression of astonishment 
directed, not at you, but at Satish. That was a prelude to my effort explain to 
Satish why the other respondents to the list have not seen fit to be more 
expansive in their responses. I thought Satish's comment was gratuitous (and 
likewise unhelpful).

 
 If you don't want to help me, don't.

Several people are trying to help. You are remaining obdurate in failing to 
explain what is desired in natural language or posting an example in R code 
with desired output, as well as in failing to heed multiple other bits of 
advice in the Posting Guide. The accepted practice in responses is to include 
any context that might further the conversation. To my mind that would have 
required that you include the original request:

 How do I do something like this without using sqldf?
 
 a - sqldf(SELECT COUNT(*) FROM b WHERE c = 'd')
 
 or
 
 e - sqldf(SELECT f, COUNT(*) FROM b GROUP BY f ORDER BY f)

How does the earlier suggestion to look at the 'table' function fail to address 
the first alternative? ( It appears it might satisfy the second one as well.)

-- 
David.


  Nobody is forcing you to reply.
 
 el
 
 On 2014-05-04, 06:56 , David Winsemius wrote:
 
 On May 3, 2014, at 9:10 PM, Satish Anupindi Rao wrote:
 
 
  By making the effort to learn R??  very constructive and not
 condescending at all.  We, lesser beings, are indebted to you,
 sir.
 
 For Pete's sake.  The OP didn't even express his original request
 in natural language or offer a working example.  Those of us who
 are not regular SQL users would have needed to parse out the SQL
 code in order to figure out what was intended.  (My guess is that
 it would have been quite easy to solve if those were what were
 offered.)  But making the effort to divine the intent didn't seem
 justified by the level of courtesy offered by the questioner.
 
 

David Winsemius
Alameda, CA, USA

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL vs R

2014-05-05 Thread Gabor Grothendieck
On Fri, May 2, 2014 at 5:23 PM, Dr Eberhard Lisse nos...@lisse.na wrote:
 Hi,

 How do I do something like this without using sqldf?

 a - sqldf(SELECT COUNT(*) FROM b WHERE c = 'd')

 or

 e - sqldf(SELECT f, COUNT(*) FROM b GROUP BY f ORDER BY f)


In the examples section at the bottom of ?sqldf are a number of SQL
statements and the corresponding R statements.

-- 
Statistics  Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL vs R

2014-05-03 Thread Dr Eberhard W Lisse
Thank you very much, Mr Arkell.

el

On 2014-05-03, 07:11 , Bert Gunter wrote:
 By making the effort to learn R?
 
 See e.g. the Introduction to R tutorial that ships with R.
 
 -- Bert
 
 Bert Gunter
 Genentech Nonclinical Biostatistics
 (650) 467-7374
 
 Data is not information. Information is not knowledge. And knowledge
 is certainly not wisdom.
 H. Gilbert Welch
 
 
 
 
 On Fri, May 2, 2014 at 2:23 PM, Dr Eberhard Lisse nos...@lisse.na wrote:
 Hi,

 How do I do something like this without using sqldf?

 a - sqldf(SELECT COUNT(*) FROM b WHERE c = 'd')

 or

 e - sqldf(SELECT f, COUNT(*) FROM b GROUP BY f ORDER BY f)

 greetings, el

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL vs R

2014-05-03 Thread Dr Eberhard W Lisse
Thanks,

will try to figure this out :-)-O

el

On 2014-05-03, 06:40 , Carlos Ortega wrote:
 Hi,
 
 With the new package dplyr you can create equivalent SQL sintaxt
 queries like the one you need.
 You can find examples of how to apply it here:
 
 http://martinsbioblogg.wordpress.com/2014/03/26/using-r-quickly-calculating-summary-statistics-with-dplyr/
 
 http://martinsbioblogg.wordpress.com/2014/03/27/more-fun-with-and/
 
 Regards,
 Carlos.
 
 
 
 
 2014-05-02 23:23 GMT+02:00 Dr Eberhard Lisse nos...@lisse.na
 mailto:nos...@lisse.na:
 
 Hi,
 
 How do I do something like this without using sqldf?
 
 a - sqldf(SELECT COUNT(*) FROM b WHERE c = 'd')
 
 or
 
 e - sqldf(SELECT f, COUNT(*) FROM b GROUP BY f ORDER BY f)
 
 greetings, el
 
 __
 R-help@r-project.org mailto:R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 
 
 
 -- 
 Saludos,
 Carlos Ortega
 www.qualityexcellence.es http://www.qualityexcellence.es

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL vs R

2014-05-03 Thread Rolf Turner


On 04/05/14 00:05, Dr Eberhard W Lisse wrote:


Thank you very much, Mr Arkell.


I don't get it.  Can anyone explain the (joke? allusion?) ?

cheers,

Rolf Turner


On 2014-05-03, 07:11 , Bert Gunter wrote:

By making the effort to learn R?

See e.g. the Introduction to R tutorial that ships with R.

-- Bert

Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

Data is not information. Information is not knowledge. And knowledge
is certainly not wisdom.
H. Gilbert Welch




On Fri, May 2, 2014 at 2:23 PM, Dr Eberhard Lisse nos...@lisse.na wrote:

Hi,

How do I do something like this without using sqldf?

a - sqldf(SELECT COUNT(*) FROM b WHERE c = 'd')

or

e - sqldf(SELECT f, COUNT(*) FROM b GROUP BY f ORDER BY f)


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL vs R

2014-05-03 Thread Sarah Goslee
On Sat, May 3, 2014 at 5:42 PM, Rolf Turner r.tur...@auckland.ac.nz wrote:

 On 04/05/14 00:05, Dr Eberhard W Lisse wrote:

 Thank you very much, Mr Arkell.


 I don't get it.  Can anyone explain the (joke? allusion?) ?

I believe it's a moderately offensive reply from someone who feels
unfairly dismissed, derived from British pop culture. But someone
who's actually British could better explain, I'm sure.

Personally, I'm not sure how much work someone who appears to have not
read the posting guide should really expect the list to do on his
behalf. But snarky replies to reasonable requests to read the
documentation are easier than doing one's own work.

Sarah

 cheers,

 Rolf Turner

 On 2014-05-03, 07:11 , Bert Gunter wrote:

 By making the effort to learn R?

 See e.g. the Introduction to R tutorial that ships with R.

 -- Bert

 Bert Gunter
 Genentech Nonclinical Biostatistics
 (650) 467-7374

 Data is not information. Information is not knowledge. And knowledge
 is certainly not wisdom.
 H. Gilbert Welch




 On Fri, May 2, 2014 at 2:23 PM, Dr Eberhard Lisse nos...@lisse.na
 wrote:

 Hi,

 How do I do something like this without using sqldf?

 a - sqldf(SELECT COUNT(*) FROM b WHERE c = 'd')

 or

 e - sqldf(SELECT f, COUNT(*) FROM b GROUP BY f ORDER BY f)


 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



-- 
Sarah Goslee
http://www.functionaldiversity.org

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL vs R

2014-05-03 Thread Dr Eberhard W Lisse
Google Pressdram :-)-O

el

On 2014-05-03, 23:42 , Rolf Turner wrote:
 On 04/05/14 00:05, Dr Eberhard W Lisse wrote:

 Thank you very much, Mr Arkell.

 I don't get it.  Can anyone explain the (joke? allusion?) ?

 cheers,

 Rolf Turner


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL vs R

2014-05-03 Thread Rolf Turner

On 04/05/14 10:16, Dr Eberhard W Lisse wrote:

Google Pressdram :-)-O

el

On 2014-05-03, 23:42 , Rolf Turner wrote:

On 04/05/14 00:05, Dr Eberhard W Lisse wrote:


Thank you very much, Mr Arkell.


I don't get it.  Can anyone explain the (joke? allusion?) ?


Thank you.

cheers,

Rolf Turner

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL vs R

2014-05-03 Thread Rolf Turner

On 04/05/14 09:58, Sarah Goslee wrote:

SNIP


Personally, I'm not sure how much work someone who appears to have not
read the posting guide should really expect the list to do on his
behalf. But snarky replies to reasonable requests to read the
documentation are easier than doing one's own work.


Well put.

cheers,

Rolf Turner

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL vs R

2014-05-03 Thread Satish Anupindi Rao

 By making the effort to learn R?? very constructive and not condescending at 
all. We, lesser beings, are indebted to you, sir.

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Bert Gunter
Sent: Saturday, May 03, 2014 1:12 AM
To: Dr Eberhard Lisse
Cc: r
Subject: Re: [R] SQL vs R

By making the effort to learn R?

See e.g. the Introduction to R tutorial that ships with R.

-- Bert

Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

Data is not information. Information is not knowledge. And knowledge is 
certainly not wisdom.
H. Gilbert Welch




On Fri, May 2, 2014 at 2:23 PM, Dr Eberhard Lisse nos...@lisse.na wrote:
 Hi,

 How do I do something like this without using sqldf?

 a - sqldf(SELECT COUNT(*) FROM b WHERE c = 'd')

 or

 e - sqldf(SELECT f, COUNT(*) FROM b GROUP BY f ORDER BY f)

 greetings, el

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL vs R

2014-05-03 Thread Jeff Newmiller
?table
?aggregate

Also, packages plyr, data.table, and dplyr. You might consider reading [1], but 
if your interests are really as simple as your examples then the table function 
should be sufficient. That function is discussed in the Introduction to R 
document that you really should have read before posting here.

[1] http://www.jstatsoft.org/v40/i01/

---
Jeff NewmillerThe .   .  Go Live...
DCN:jdnew...@dcn.davis.ca.usBasics: ##.#.   ##.#.  Live Go...
  Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
/Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
--- 
Sent from my phone. Please excuse my brevity.

On May 2, 2014 2:23:13 PM PDT, Dr Eberhard Lisse nos...@lisse.na wrote:
Hi,

How do I do something like this without using sqldf?

a - sqldf(SELECT COUNT(*) FROM b WHERE c = 'd')

or

e - sqldf(SELECT f, COUNT(*) FROM b GROUP BY f ORDER BY f)

greetings, el

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL vs R

2014-05-03 Thread David Winsemius

On May 3, 2014, at 9:10 PM, Satish Anupindi Rao wrote:

 
  By making the effort to learn R?? very constructive and not condescending 
 at all. We, lesser beings, are indebted to you, sir.

For Pete's sake. The OP didn't even express his original request in natural 
language or offer a working example. Those of us who are not regular SQL users 
would have needed to parse out the SQL code in order to figure out what was 
intended. (My guess is that it would have been quite easy to solve if those 
were what were offered.)  But making the effort to divine the intent didn't 
seem justified by the level of courtesy offered by the questioner.

-- 
David.

 
 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
 Behalf Of Bert Gunter
 Sent: Saturday, May 03, 2014 1:12 AM
 To: Dr Eberhard Lisse
 Cc: r
 Subject: Re: [R] SQL vs R
 
 By making the effort to learn R?
 
 See e.g. the Introduction to R tutorial that ships with R.
 
 -- Bert
 
 Bert Gunter
 Genentech Nonclinical Biostatistics
 (650) 467-7374
 
 Data is not information. Information is not knowledge. And knowledge is 
 certainly not wisdom.
 H. Gilbert Welch
 
 
 
 
 On Fri, May 2, 2014 at 2:23 PM, Dr Eberhard Lisse nos...@lisse.na wrote:
 Hi,
 
 How do I do something like this without using sqldf?
 
 a - sqldf(SELECT COUNT(*) FROM b WHERE c = 'd')
 
 or
 
 e - sqldf(SELECT f, COUNT(*) FROM b GROUP BY f ORDER BY f)
 
 greetings, el
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

David Winsemius
Alameda, CA, USA

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] SQL vs R

2014-05-02 Thread Dr Eberhard Lisse
Hi,

How do I do something like this without using sqldf?

a - sqldf(SELECT COUNT(*) FROM b WHERE c = 'd')

or

e - sqldf(SELECT f, COUNT(*) FROM b GROUP BY f ORDER BY f)

greetings, el

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL vs R

2014-05-02 Thread Carlos Ortega
Hi,

With the new package dplyr you can create equivalent SQL sintaxt queries
like the one you need.
You can find examples of how to apply it here:

http://martinsbioblogg.wordpress.com/2014/03/26/using-r-quickly-calculating-summary-statistics-with-dplyr/

http://martinsbioblogg.wordpress.com/2014/03/27/more-fun-with-and/

Regards,
Carlos.




2014-05-02 23:23 GMT+02:00 Dr Eberhard Lisse nos...@lisse.na:

 Hi,

 How do I do something like this without using sqldf?

 a - sqldf(SELECT COUNT(*) FROM b WHERE c = 'd')

 or

 e - sqldf(SELECT f, COUNT(*) FROM b GROUP BY f ORDER BY f)

 greetings, el

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Saludos,
Carlos Ortega
www.qualityexcellence.es

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL vs R

2014-05-02 Thread Bert Gunter
By making the effort to learn R?

See e.g. the Introduction to R tutorial that ships with R.

-- Bert

Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

Data is not information. Information is not knowledge. And knowledge
is certainly not wisdom.
H. Gilbert Welch




On Fri, May 2, 2014 at 2:23 PM, Dr Eberhard Lisse nos...@lisse.na wrote:
 Hi,

 How do I do something like this without using sqldf?

 a - sqldf(SELECT COUNT(*) FROM b WHERE c = 'd')

 or

 e - sqldf(SELECT f, COUNT(*) FROM b GROUP BY f ORDER BY f)

 greetings, el

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] SQL queries in R

2013-06-07 Thread Sneha Bishnoi
Hey all!

I am trying to select a bunch of id's  (data type -character) from a table
and store them in a variable in R
But when i do this, it automatically truncates the leading zero's in id's
even though they are of character type.

code is :-

myconn-odbcConnect(testdata)
sql.select-paste(select UNIT_ID from UNITS where (UNIT_TYPE=',unit,'
and COMMUNITY=',property,'),sep=)
 unit_ids-sqlQuery(myconn,sql.select)
print(unit_ids)
is there anyway i can retain the UNIT_ID's as they are.

Thanks!
Sneha

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL queries in R

2013-06-07 Thread andrija djurovic
?sqlQuery
as.is - argument
Andrija
On Jun 7, 2013 9:10 PM, Sneha Bishnoi sneha.bish...@gmail.com wrote:

 Hey all!

 I am trying to select a bunch of id's  (data type -character) from a table
 and store them in a variable in R
 But when i do this, it automatically truncates the leading zero's in id's
 even though they are of character type.

 code is :-

 myconn-odbcConnect(testdata)
 sql.select-paste(select UNIT_ID from UNITS where (UNIT_TYPE=',unit,'
 and COMMUNITY=',property,'),sep=)
  unit_ids-sqlQuery(myconn,sql.select)
 print(unit_ids)
 is there anyway i can retain the UNIT_ID's as they are.

 Thanks!
 Sneha

 [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL queries in R

2013-06-07 Thread Sneha Bishnoi
tried as.is ,gives an error,

[1] 01000 10054 [Microsoft][ODBC SQL Server
Driver][DBNETLIB]ConnectionWrite (send()).
[2] [RODBC] ERROR: Could not SQLExecDirect 'select UNIT_ID from UNITS
where (UNIT_TYPE='1X1' and COMMUNITY='SAN1193')'


On Fri, Jun 7, 2013 at 3:21 PM, andrija djurovic djandr...@gmail.comwrote:

 ?sqlQuery
 as.is - argument
 Andrija
 On Jun 7, 2013 9:10 PM, Sneha Bishnoi sneha.bish...@gmail.com wrote:

 Hey all!

 I am trying to select a bunch of id's  (data type -character) from a table
 and store them in a variable in R
 But when i do this, it automatically truncates the leading zero's in id's
 even though they are of character type.

 code is :-

 myconn-odbcConnect(testdata)
 sql.select-paste(select UNIT_ID from UNITS where (UNIT_TYPE=',unit,'
 and COMMUNITY=',property,'),sep=)
  unit_ids-sqlQuery(myconn,sql.select)
 print(unit_ids)
 is there anyway i can retain the UNIT_ID's as they are.

 Thanks!
 Sneha

 [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Sneha Bishnoi
+14047235469
H. Milton Stewart School of Industrial   Systems Engineering
Georgia Tech

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL queries in R

2013-06-07 Thread andrija djurovic
myconn-odbcConnect(testdata)
sql.select-paste(select UNIT_ID from UNITS where (UNIT_TYPE=',unit,'
and COMMUNITY=',property,'),sep=)

unit_ids-sqlQuery(myconn,sql.select,as.is=TRUE)

This should works if myconn and sql.select are defined properly

Andrija
On Jun 7, 2013 9:58 PM, Sneha Bishnoi sneha.bish...@gmail.com wrote:

 tried as.is ,gives an error,

 [1] 01000 10054 [Microsoft][ODBC SQL Server
 Driver][DBNETLIB]ConnectionWrite (send()).
 [2] [RODBC] ERROR: Could not SQLExecDirect 'select UNIT_ID from UNITS
 where (UNIT_TYPE='1X1' and COMMUNITY='SAN1193')'


 On Fri, Jun 7, 2013 at 3:21 PM, andrija djurovic djandr...@gmail.comwrote:

 ?sqlQuery
 as.is - argument
 Andrija
 On Jun 7, 2013 9:10 PM, Sneha Bishnoi sneha.bish...@gmail.com wrote:

 Hey all!

 I am trying to select a bunch of id's  (data type -character) from a
 table
 and store them in a variable in R
 But when i do this, it automatically truncates the leading zero's in id's
 even though they are of character type.

 code is :-

 myconn-odbcConnect(testdata)
 sql.select-paste(select UNIT_ID from UNITS where (UNIT_TYPE=',unit,'
 and COMMUNITY=',property,'),sep=)
  unit_ids-sqlQuery(myconn,sql.select)
 print(unit_ids)
 is there anyway i can retain the UNIT_ID's as they are.

 Thanks!
 Sneha

 [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




 --
 Sneha Bishnoi
 +14047235469
 H. Milton Stewart School of Industrial   Systems Engineering
 Georgia Tech


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] [SQL]

2013-04-25 Thread Ignacio Martinez
Hi,

The data for my new project are in a bunch of .sql files, instead of the
clasic csv files that I'm used to work with.

Could someone explain to me how to read these files into R?

Thanks,

-Ignacio

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] [SQL]

2013-04-25 Thread MacQueen, Don
With so little information, one can only guess.

I would guess your .sql files contain scripts written in the SQL
language, in which case you will need some local database support to help
you run those scripts in whatever database has the data. Perhaps the
scripts will output csv files.

If it turns out that you need run the SQL scripts from within R, then
I'd suggest asking for help on R-sig-db.

-Don

-- 
Don MacQueen

Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062





On 4/25/13 9:09 AM, Ignacio Martinez ignaci...@gmail.com wrote:

Hi,

The data for my new project are in a bunch of .sql files, instead of the
clasic csv files that I'm used to work with.

Could someone explain to me how to read these files into R?

Thanks,

-Ignacio

   [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] [SQL]

2013-04-25 Thread Jeff Newmiller
The format of files with a SQL extension are not necessarily well- defined. In 
most cases I have found, they are text files that contain SQL Data Definition 
Language statements (CREATE TABLE) and possibly Data Manipulation Language 
statements (INSERT INTO). You may be able to extract the portions of the files 
that contain data using read.csv and judicious use of the skip and nrow 
arguments, but you will have to first become familiar with the contents of the 
file using a text editor. If they are binary files, you may need to consult 
with the source of the data to identify the format used more precisely.
---
Jeff NewmillerThe .   .  Go Live...
DCN:jdnew...@dcn.davis.ca.usBasics: ##.#.   ##.#.  Live Go...
  Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
/Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
--- 
Sent from my phone. Please excuse my brevity.

Ignacio Martinez ignaci...@gmail.com wrote:

Hi,

The data for my new project are in a bunch of .sql files, instead of
the
clasic csv files that I'm used to work with.

Could someone explain to me how to read these files into R?

Thanks,

-Ignacio

   [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] SQL via RODBC

2012-10-26 Thread Dr. Alireza Zolfaghari
Hi list,
Is there a way to use sqlQuery function where there is a sql file (ie.
sample.sql)? I just want to mention that in my sql file there are some
comment lines (starting with --). This means that if I paste all the lines
in the sql file, I'll come up with a long string that most part of it is
commneted (after commnet sign --).

thanks
Arvin

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL via RODBC

2012-10-26 Thread Jeff Newmiller
That would be very implementation-specific, and ODBC is generic in its own way. 
No, you must run one query at a time in general, and deal with the results 
using the procedural language.

Keep in mind that you have to pick a back-end database to work with, and for 
creating the database you may have tools available that can directly process 
that sql file once.  Then you can use a small number of queries from within R 
to retrieve data at will. If the file contains data queries whose output you 
want to process in R, they have to be handled one-by-one.

---
Jeff NewmillerThe .   .  Go Live...
DCN:jdnew...@dcn.davis.ca.usBasics: ##.#.   ##.#.  Live Go...
  Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
/Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
--- 
Sent from my phone. Please excuse my brevity.

Dr. Alireza Zolfaghari ali.zolfagh...@gmail.com wrote:

Hi list,
Is there a way to use sqlQuery function where there is a sql file (ie.
sample.sql)? I just want to mention that in my sql file there are some
comment lines (starting with --). This means that if I paste all the
lines
in the sql file, I'll come up with a long string that most part of it
is
commneted (after commnet sign --).

thanks
Arvin

   [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL query with Multicore option on R -linux

2012-09-21 Thread Heramb Gadgil
This is what I think;

Hive internally distributes the data. If you have set up Hive on single
core it will fetch the query results from that core. If you have multi-core
system on which you have setup the Hive, it will search all the cores for
the query submitted and results would be compiled together for final
output. The query would remain the same no matter whther you have a single
core or multiple cores or a cluster of machines.

Please correct me if I am going the wrong way.

Best,
Heramb

On Thu, Sep 20, 2012 at 11:34 PM, Madana_Babu madana_b...@infosys.comwrote:

 Hi all,

 I have the following sql query that I am executing on a machine with single
 core. I want to know how can I execute the same sqery on a maching that is
 running with 4 cores. Please provide me the code.

 NEW_TABLE - rhive.query(SELECT A, B, COUNT(C) FROM TABLE_A WHERE
 A='01-01-2012')

 Also let me know how can I leverage only 2 / 3 cores of the machine.

 Regards,
 Madana



 --
 View this message in context:
 http://r.789695.n4.nabble.com/SQL-query-with-Multicore-option-on-R-linux-tp4643771.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] SQL query with Multicore option on R -linux

2012-09-20 Thread Madana_Babu
Hi all, 

I have the following sql query that I am executing on a machine with single
core. I want to know how can I execute the same sqery on a maching that is
running with 4 cores. Please provide me the code. 

NEW_TABLE - rhive.query(SELECT A, B, COUNT(C) FROM TABLE_A WHERE
A='01-01-2012') 

Also let me know how can I leverage only 2 / 3 cores of the machine. 

Regards, 
Madana



--
View this message in context: 
http://r.789695.n4.nabble.com/SQL-query-with-Multicore-option-on-R-linux-tp4643771.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] SQL query with multicore option

2012-09-14 Thread Madana_Babu
Hi all,

I have the following sql query that I am executing on a machine with single
core. I want to know how can I execute the same sqery on a maching that is
running with 4 cores. Please provide me the code.

NEW_TABLE - rhive.query(SELECT A, B, COUNT(C) FROM TABLE_A WHERE
A='01-01-2012')

Also let me know how can I leverage only 2 / 3 cores of the machine.

Regards,
Madana



--
View this message in context: 
http://r.789695.n4.nabble.com/SQL-query-with-multicore-option-tp4643197.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL select ... where R variable

2011-12-14 Thread agent dunham
Thank you, 

I guess it didn't work for me, maybe is not possible?

I've tried: 

 con- odbcDriverConnect(Driver=SQL Server;
 Server=...\\...;Database=...;Uid=...;Pwd=... ;)

 v1=sqlQuery(con, select v1 from sqltable where v3 =cte and v2 in (select
 v2 from R_dataframe) order by (select v2 from R_dataframe))

 head(rbind(R_dataframe$v2, v1))
   [,1] 
  
   1251   
  
v1 42S02 208 [Microsoft][ODBC SQL Server Driver][SQL Server]*The name of
the object 'R_dataframe' is not valid.*

Thanks in advance, 
u...@host.com
   

--
View this message in context: 
http://r.789695.n4.nabble.com/SQL-select-where-R-variable-tp4190882p4194629.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL select ... where R variable

2011-12-14 Thread Gabor Grothendieck
On Wed, Dec 14, 2011 at 7:04 AM, agent dunham crossp...@hotmail.com wrote:
 Thank you,

 I guess it didn't work for me, maybe is not possible?

 I've tried:

 con- odbcDriverConnect(Driver=SQL Server;
 Server=...\\...;Database=...;Uid=...;Pwd=... ;)

 v1=sqlQuery(con, select v1 from sqltable where v3 =cte and v2 in (select
 v2 from R_dataframe) order by (select v2 from R_dataframe))

 head(rbind(R_dataframe$v2, v1))
       [,1]
       1251
 v1 42S02 208 [Microsoft][ODBC SQL Server Driver][SQL Server]*The name of
 the object 'R_dataframe' is not valid.*

Since you have changed the problem by introducing new elements into it
clearly the answer must change too.  Either:

1. write R_variable to a table in your database and revise your SQL
statement so that its valid SQL or
2. if there are only a few elements in R_variable$x e.g. making it a
plain vector with R_variable - 1:10 then construct the appropriate
sql statement:

R_variable - 1:10
sql_stmt - sprintf(select v1
   from sqltable
   where v2 in ( %s )
   order by v2
, toString(R_variable))

which gives:

 cat(sql_stmt)
select v1
   from sql_table
   where v2 in ( 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 )
   order by v2




-- 
Statistics  Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] SQL select ... where R variable

2011-12-13 Thread agent dunham
Before loading sql packages, i was wondering, once all packages required are
installed, if  it's possible to do from R console sth like this: 

SQL select v1 from sql_table
where v2 in (R_variable)  # being R_variable a vector

Thanks in advance, u...@host.com

--
View this message in context: 
http://r.789695.n4.nabble.com/SQL-select-where-R-variable-tp4190882p4190882.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL select ... where R variable

2011-12-13 Thread Gabor Grothendieck
On Tue, Dec 13, 2011 at 10:54 AM, agent dunham crossp...@hotmail.com wrote:
 Before loading sql packages, i was wondering, once all packages required are
 installed, if  it's possible to do from R console sth like this:

 SQL select v1 from sql_table
        where v2 in (R_variable)  # being R_variable a vector


Try this:

# test data
R_variable - data.frame(x = 1:10)
sql_table - data.frame(v1 = 1:10, v2 = rep(0:1, each = 5))

library(sqldf)
sqldf(select v1 from sql_table where v2 in (select x from R_variable))

The result of the last line is:
  v1
1  6
2  7
3  8
4  9
5 10

-- 
Statistics  Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] SQL Changing Data Type

2011-06-24 Thread GL
Passing in two dates to a sql statement (sqldf). Is returning a factor. Tried
setting back to a Date via as.Date, but get an error the error: character
string is not in a standard unambiguous format. Any thoughts appreciated. 

Code/Results listed below:

 summary(df.possible.combos)
  Date Hour  
 Min.   :2011-03-01   Min.   : 0.00  
 1st Qu.:2011-03-23   1st Qu.: 5.75  
 Median :2011-04-14   Median :11.50  
 Mean   :2011-04-14   Mean   :11.50  
 3rd Qu.:2011-05-06   3rd Qu.:17.25  
 Max.   :2011-05-31   Max.   :23.00  
 summary(df.aggregate)   
  Date Hour   x 
 Min.   :2011-03-01   16 : 82   Min.   : 1.000  
 1st Qu.:2011-03-22   17 : 82   1st Qu.: 1.000  
 Median :2011-04-13   18 : 82   Median : 2.000  
 Mean   :2011-04-14   19 : 79   Mean   : 4.195  
 3rd Qu.:2011-05-07   20 : 76   3rd Qu.: 7.000  
 Max.   :2011-05-31   7  : 75   Max.   :20.000  
  (Other):377   
 #merge raw data and all possible combinations
   df.final - sqldf('select Date, Hour, x as RoomsInUse from
 df.aggregate
+ left join df.possible.combos using (Hour, Date)')
 summary(df.final)
  Date  Hour   RoomsInUse
 15069.0: 16   16 : 82   Min.   : 1.000  
 15114.0: 16   17 : 82   1st Qu.: 1.000  
 15063.0: 15   18 : 82   Median : 2.000  
 15082.0: 15   19 : 79   Mean   : 4.195  
 15125.0: 15   20 : 76   3rd Qu.: 7.000  
 15044.0: 14   7  : 75   Max.   :20.000  
 (Other):762   (Other):377   
 thedate - as.Date(df.final$Date)
Error in charToDate(x) : 
  character string is not in a standard unambiguous format
 


--
View this message in context: 
http://r.789695.n4.nabble.com/SQL-Changing-Data-Type-tp3623508p3623508.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL Changing Data Type

2011-06-24 Thread Dennis Murphy
Hi:

str() probably would have been more informative than summary(), but if
the date is a factor, perhaps this will help:

as.Date(as.character(Date), format = '%Y-%m-%d')

Including the format argument is usually safer than letting the
function try to figure it out on its own. Neglecting to include it may
well be the source of the error message.

You might also look into the POSIX classes if you want to combine date
and time. Start with ?DateTimeClasses to learn more.

HTH,
Dennis

On Fri, Jun 24, 2011 at 1:12 PM, GL pfl...@shands.ufl.edu wrote:
 Passing in two dates to a sql statement (sqldf). Is returning a factor. Tried
 setting back to a Date via as.Date, but get an error the error: character
 string is not in a standard unambiguous format. Any thoughts appreciated.

 Code/Results listed below:

 summary(df.possible.combos)
      Date                 Hour
  Min.   :2011-03-01   Min.   : 0.00
  1st Qu.:2011-03-23   1st Qu.: 5.75
  Median :2011-04-14   Median :11.50
  Mean   :2011-04-14   Mean   :11.50
  3rd Qu.:2011-05-06   3rd Qu.:17.25
  Max.   :2011-05-31   Max.   :23.00
 summary(df.aggregate)
      Date                 Hour           x
  Min.   :2011-03-01   16     : 82   Min.   : 1.000
  1st Qu.:2011-03-22   17     : 82   1st Qu.: 1.000
  Median :2011-04-13   18     : 82   Median : 2.000
  Mean   :2011-04-14   19     : 79   Mean   : 4.195
  3rd Qu.:2011-05-07   20     : 76   3rd Qu.: 7.000
  Max.   :2011-05-31   7      : 75   Max.   :20.000
                      (Other):377
 #merge raw data and all possible combinations
   df.final - sqldf('select Date, Hour, x as RoomsInUse from
 df.aggregate
 +     left join df.possible.combos using (Hour, Date)')
 summary(df.final)
      Date          Hour       RoomsInUse
  15069.0: 16   16     : 82   Min.   : 1.000
  15114.0: 16   17     : 82   1st Qu.: 1.000
  15063.0: 15   18     : 82   Median : 2.000
  15082.0: 15   19     : 79   Mean   : 4.195
  15125.0: 15   20     : 76   3rd Qu.: 7.000
  15044.0: 14   7      : 75   Max.   :20.000
  (Other):762   (Other):377
 thedate - as.Date(df.final$Date)
 Error in charToDate(x) :
  character string is not in a standard unambiguous format



 --
 View this message in context: 
 http://r.789695.n4.nabble.com/SQL-Changing-Data-Type-tp3623508p3623508.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL Changing Data Type

2011-06-24 Thread Gabor Grothendieck
On Fri, Jun 24, 2011 at 4:12 PM, GL pfl...@shands.ufl.edu wrote:
 Passing in two dates to a sql statement (sqldf). Is returning a factor. Tried
 setting back to a Date via as.Date, but get an error the error: character
 string is not in a standard unambiguous format. Any thoughts appreciated.

 Code/Results listed below:

 summary(df.possible.combos)
      Date                 Hour
  Min.   :2011-03-01   Min.   : 0.00
  1st Qu.:2011-03-23   1st Qu.: 5.75
  Median :2011-04-14   Median :11.50
  Mean   :2011-04-14   Mean   :11.50
  3rd Qu.:2011-05-06   3rd Qu.:17.25
  Max.   :2011-05-31   Max.   :23.00
 summary(df.aggregate)
      Date                 Hour           x
  Min.   :2011-03-01   16     : 82   Min.   : 1.000
  1st Qu.:2011-03-22   17     : 82   1st Qu.: 1.000
  Median :2011-04-13   18     : 82   Median : 2.000
  Mean   :2011-04-14   19     : 79   Mean   : 4.195
  3rd Qu.:2011-05-07   20     : 76   3rd Qu.: 7.000
  Max.   :2011-05-31   7      : 75   Max.   :20.000
                      (Other):377
 #merge raw data and all possible combinations
   df.final - sqldf('select Date, Hour, x as RoomsInUse from
 df.aggregate
 +     left join df.possible.combos using (Hour, Date)')
 summary(df.final)
      Date          Hour       RoomsInUse
  15069.0: 16   16     : 82   Min.   : 1.000
  15114.0: 16   17     : 82   1st Qu.: 1.000
  15063.0: 15   18     : 82   Median : 2.000
  15082.0: 15   19     : 79   Mean   : 4.195
  15125.0: 15   20     : 76   3rd Qu.: 7.000
  15044.0: 14   7      : 75   Max.   :20.000
  (Other):762   (Other):377
 thedate - as.Date(df.final$Date)
 Error in charToDate(x) :
  character string is not in a standard unambiguous format


Please read the last line to every message to r-help and provide this
in reproducible form as requested.  Use dput to display all inputs and
showing all code. Also ensure you are using the latest version of
sqldf which is 0.4-1.

-- 
Statistics  Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Decimals in R/SQL

2011-04-13 Thread Rachel Licata
Hello,

When I am writing in sqldf or RSQLite I lose the decimals in my matrix.
The only way I can get decimals is by multiplying by 1.0, etc.  I
have tried manipulating the options, but it is only effective once I
multiply by 1..  

I appreciate any suggestions!

Thanks! 

Example:

z - sqldf (select ST,
SUM(AGEP*PWGTP)*1.0/SUM(PWGTP)*1.00 as wgtage from
ss09 group by ST)

z shouldn't be whole numbers.  


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Decimals in R/SQL

2011-04-13 Thread jim holtman
You at least have to provide a subset of 'ss09' so we can see what the
original data looks like.  I have not had any problems with decimals
in using sqldf.

 x - as.data.frame(matrix(runif(100)*100, 10))
 x$key - sample(1:3, 10, TRUE)
 require(sqldf)
 xsum - sqldf('
+ select key, sum(V1 * V2) / sum(V3)
+ from x
+ group by key
+ ')


 xsum
  key sum(V1 * V2) / sum(V3)
1   1   19.38166
2   2   17.40503
3   3   71.48818
 dput(xsum)
structure(list(key = 1:3, `sum(V1 * V2) / sum(V3)` = c(19.3816573628268,
17.4050302312273, 71.4881812227571)), .Names = c(key, sum(V1 * V2) / sum(V3)
), row.names = c(NA, 3L), class = data.frame)




On Wed, Apr 13, 2011 at 11:10 AM, Rachel Licata rach...@kff.org wrote:
 Hello,

 When I am writing in sqldf or RSQLite I lose the decimals in my matrix.
 The only way I can get decimals is by multiplying by 1.0, etc.  I
 have tried manipulating the options, but it is only effective once I
 multiply by 1..

 I appreciate any suggestions!

 Thanks!

 Example:

 z - sqldf (select ST,
 SUM(AGEP*PWGTP)*1.0/SUM(PWGTP)*1.00 as wgtage from
 ss09 group by ST)

 z shouldn't be whole numbers.


        [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Decimals in R/SQL

2011-04-13 Thread Rachel Licata
Thanks Jim.  It appears the issue may only be in SQLite.  SS09 is a large table 
and here is a subset of the variables I am working with.  


SS09
ST  AGEPPWGTP
33323130 130
33324110 186
333251 2 162
33326180  93
33327129 135
33328166  54
33329162  54
0121 138
1129 103
21 7 144
31 5 143

z - dbGetQuery( connSQLite , select ST, SUM(AGEP*PWGTP/SUM(PWGTP) as wgtage 
from ss09 group by ST)

ST wgtage
1   1 37
2   2 33
3   4 36
4   5 37
5   6 35

z - dbGetQuery( connSQLite , select ST, 
SUM(AGEP*PWGTP)*1.0/SUM(PWGTP)*1.00 as wgtage from ss09 
group by ST)

ST   wgtage
1   1 37.57083
2   2 33.94322
3   4 36.14499
4   5 37.51233
5   6 35.65581

-Original Message-
From: jim holtman [mailto:jholt...@gmail.com] 
Sent: Wednesday, April 13, 2011 12:16 PM
To: Rachel Licata
Cc: r-help@r-project.org
Subject: Re: [R] Decimals in R/SQL

You at least have to provide a subset of 'ss09' so we can see what the
original data looks like.  I have not had any problems with decimals
in using sqldf.

 x - as.data.frame(matrix(runif(100)*100, 10))
 x$key - sample(1:3, 10, TRUE)
 require(sqldf)
 xsum - sqldf('
+ select key, sum(V1 * V2) / sum(V3)
+ from x
+ group by key
+ ')


 xsum
  key sum(V1 * V2) / sum(V3)
1   1   19.38166
2   2   17.40503
3   3   71.48818
 dput(xsum)
structure(list(key = 1:3, `sum(V1 * V2) / sum(V3)` = c(19.3816573628268,
17.4050302312273, 71.4881812227571)), .Names = c(key, sum(V1 * V2) / sum(V3)
), row.names = c(NA, 3L), class = data.frame)




On Wed, Apr 13, 2011 at 11:10 AM, Rachel Licata rach...@kff.org wrote:
 Hello,

 When I am writing in sqldf or RSQLite I lose the decimals in my matrix.
 The only way I can get decimals is by multiplying by 1.0, etc.  I
 have tried manipulating the options, but it is only effective once I
 multiply by 1..

 I appreciate any suggestions!

 Thanks!

 Example:

 z - sqldf (select ST,
 SUM(AGEP*PWGTP)*1.0/SUM(PWGTP)*1.00 as wgtage from
 ss09 group by ST)

 z shouldn't be whole numbers.


        [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Decimals in R/SQL

2011-04-13 Thread jim holtman
The problem is that you data is 'integer' and I assume that the
database is keeping everything integer.  You can do what you are
doing, or convert to 'numeric':

 x - read.table(textConnection(   ST  AGEP   
  PWGTP
+ 33323130 130
+ 33324110 186
+ 333251 2 162
+ 33326180  93
+ 33327129 135
+ 33328166  54
+ 33329162  54
+ 0121 138
+ 1129 103
+ 21 7 144
+ 31 5 143), header = TRUE)
 closeAllConnections()
 str(x)
'data.frame':   11 obs. of  3 variables:
 $ ST   : int  1 1 1 1 1 1 1 1 1 1 ...
 $ AGEP : int  30 10 2 80 29 66 62 21 29 7 ...
 $ PWGTP: int  130 186 162 93 135 54 54 138 103 144 ...
 require(sqldf)
 xsum - sqldf('
+ select ST, sum(AGEP * PWGTP) / sum(PWGTP)
+ from x
+ group by ST
+ ')
 xsum
  ST sum(AGEP * PWGTP) / sum(PWGTP)
1  1 23
 # change to numeric instead of integer
 x$AGEP - as.numeric(x$AGEP)
 str(x)
'data.frame':   11 obs. of  3 variables:
 $ ST   : int  1 1 1 1 1 1 1 1 1 1 ...
 $ AGEP : num  30 10 2 80 29 66 62 21 29 7 ...
 $ PWGTP: int  130 186 162 93 135 54 54 138 103 144 ...
 xsum - sqldf('
+ select ST, sum(AGEP * PWGTP) / sum(PWGTP)
+ from x
+ group by ST
+ ')
 xsum
  ST sum(AGEP * PWGTP) / sum(PWGTP)
1  1   23.81446



On Wed, Apr 13, 2011 at 12:42 PM, Rachel Licata rach...@kff.org wrote:
 Thanks Jim.  It appears the issue may only be in SQLite.  SS09 is a large 
 table and here is a subset of the variables I am working with.


 SS09
            ST                  AGEP                    PWGTP
 33323    1                30                 130
 33324    1                10                 186
 33325    1                 2                 162
 33326    1                80                  93
 33327    1                29                 135
 33328    1                66                  54
 33329    1                62                  54
 0    1                21                 138
 1    1                29                 103
 2    1                 7                 144
 3    1                 5                 143

 z - dbGetQuery( connSQLite , select ST, SUM(AGEP*PWGTP/SUM(PWGTP) as wgtage 
 from ss09 group by ST)

 ST wgtage
 1   1     37
 2   2     33
 3   4     36
 4   5     37
 5   6     35

 z - dbGetQuery( connSQLite , select ST, 
 SUM(AGEP*PWGTP)*1.0/SUM(PWGTP)*1.00 as wgtage from ss09 
 group by ST)

 ST   wgtage
 1   1 37.57083
 2   2 33.94322
 3   4 36.14499
 4   5 37.51233
 5   6 35.65581

 -Original Message-
 From: jim holtman [mailto:jholt...@gmail.com]
 Sent: Wednesday, April 13, 2011 12:16 PM
 To: Rachel Licata
 Cc: r-help@r-project.org
 Subject: Re: [R] Decimals in R/SQL

 You at least have to provide a subset of 'ss09' so we can see what the
 original data looks like.  I have not had any problems with decimals
 in using sqldf.

 x - as.data.frame(matrix(runif(100)*100, 10))
 x$key - sample(1:3, 10, TRUE)
 require(sqldf)
 xsum - sqldf('
 +     select key, sum(V1 * V2) / sum(V3)
 +         from x
 +         group by key
 + ')


 xsum
  key sum(V1 * V2) / sum(V3)
 1   1               19.38166
 2   2               17.40503
 3   3               71.48818
 dput(xsum)
 structure(list(key = 1:3, `sum(V1 * V2) / sum(V3)` = c(19.3816573628268,
 17.4050302312273, 71.4881812227571)), .Names = c(key, sum(V1 * V2) / 
 sum(V3)
 ), row.names = c(NA, 3L), class = data.frame)




 On Wed, Apr 13, 2011 at 11:10 AM, Rachel Licata rach...@kff.org wrote:
 Hello,

 When I am writing in sqldf or RSQLite I lose the decimals in my matrix.
 The only way I can get decimals is by multiplying by 1.0, etc.  I
 have tried manipulating the options, but it is only effective once I
 multiply by 1..

 I appreciate any suggestions!

 Thanks!

 Example:

 z - sqldf (select ST,
 SUM(AGEP*PWGTP)*1.0/SUM(PWGTP)*1.00 as wgtage from
 ss09 group by ST)

 z shouldn't be whole numbers.


        [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




 --
 Jim Holtman
 Data Munger Guru

 What is the problem that you are trying to solve?




-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help

Re: [R] Decimals in R/SQL

2011-04-13 Thread Rachel Licata
Thanks again Jim - that is really helpful and I apologize that I am new to R.  
How can I convert to numeric in SQL and when I am working on a table in a 
database?  The file is huge so that is why I am using SQL and the database to 
work through it.  
Thanks!

-Original Message-
From: jim holtman [mailto:jholt...@gmail.com] 
Sent: Wednesday, April 13, 2011 12:52 PM
To: Rachel Licata
Cc: r-help@r-project.org
Subject: Re: [R] Decimals in R/SQL

The problem is that you data is 'integer' and I assume that the
database is keeping everything integer.  You can do what you are
doing, or convert to 'numeric':

 x - read.table(textConnection(   ST  AGEP   
  PWGTP
+ 33323130 130
+ 33324110 186
+ 333251 2 162
+ 33326180  93
+ 33327129 135
+ 33328166  54
+ 33329162  54
+ 0121 138
+ 1129 103
+ 21 7 144
+ 31 5 143), header = TRUE)
 closeAllConnections()
 str(x)
'data.frame':   11 obs. of  3 variables:
 $ ST   : int  1 1 1 1 1 1 1 1 1 1 ...
 $ AGEP : int  30 10 2 80 29 66 62 21 29 7 ...
 $ PWGTP: int  130 186 162 93 135 54 54 138 103 144 ...
 require(sqldf)
 xsum - sqldf('
+ select ST, sum(AGEP * PWGTP) / sum(PWGTP)
+ from x
+ group by ST
+ ')
 xsum
  ST sum(AGEP * PWGTP) / sum(PWGTP)
1  1 23
 # change to numeric instead of integer
 x$AGEP - as.numeric(x$AGEP)
 str(x)
'data.frame':   11 obs. of  3 variables:
 $ ST   : int  1 1 1 1 1 1 1 1 1 1 ...
 $ AGEP : num  30 10 2 80 29 66 62 21 29 7 ...
 $ PWGTP: int  130 186 162 93 135 54 54 138 103 144 ...
 xsum - sqldf('
+ select ST, sum(AGEP * PWGTP) / sum(PWGTP)
+ from x
+ group by ST
+ ')
 xsum
  ST sum(AGEP * PWGTP) / sum(PWGTP)
1  1   23.81446



On Wed, Apr 13, 2011 at 12:42 PM, Rachel Licata rach...@kff.org wrote:
 Thanks Jim.  It appears the issue may only be in SQLite.  SS09 is a large 
 table and here is a subset of the variables I am working with.


 SS09
            ST                  AGEP                    PWGTP
 33323    1                30                 130
 33324    1                10                 186
 33325    1                 2                 162
 33326    1                80                  93
 33327    1                29                 135
 33328    1                66                  54
 33329    1                62                  54
 0    1                21                 138
 1    1                29                 103
 2    1                 7                 144
 3    1                 5                 143

 z - dbGetQuery( connSQLite , select ST, SUM(AGEP*PWGTP/SUM(PWGTP) as wgtage 
 from ss09 group by ST)

 ST wgtage
 1   1     37
 2   2     33
 3   4     36
 4   5     37
 5   6     35

 z - dbGetQuery( connSQLite , select ST, 
 SUM(AGEP*PWGTP)*1.0/SUM(PWGTP)*1.00 as wgtage from ss09 
 group by ST)

 ST   wgtage
 1   1 37.57083
 2   2 33.94322
 3   4 36.14499
 4   5 37.51233
 5   6 35.65581

 -Original Message-
 From: jim holtman [mailto:jholt...@gmail.com]
 Sent: Wednesday, April 13, 2011 12:16 PM
 To: Rachel Licata
 Cc: r-help@r-project.org
 Subject: Re: [R] Decimals in R/SQL

 You at least have to provide a subset of 'ss09' so we can see what the
 original data looks like.  I have not had any problems with decimals
 in using sqldf.

 x - as.data.frame(matrix(runif(100)*100, 10))
 x$key - sample(1:3, 10, TRUE)
 require(sqldf)
 xsum - sqldf('
 +     select key, sum(V1 * V2) / sum(V3)
 +         from x
 +         group by key
 + ')


 xsum
  key sum(V1 * V2) / sum(V3)
 1   1               19.38166
 2   2               17.40503
 3   3               71.48818
 dput(xsum)
 structure(list(key = 1:3, `sum(V1 * V2) / sum(V3)` = c(19.3816573628268,
 17.4050302312273, 71.4881812227571)), .Names = c(key, sum(V1 * V2) / 
 sum(V3)
 ), row.names = c(NA, 3L), class = data.frame)




 On Wed, Apr 13, 2011 at 11:10 AM, Rachel Licata rach...@kff.org wrote:
 Hello,

 When I am writing in sqldf or RSQLite I lose the decimals in my matrix.
 The only way I can get decimals is by multiplying by 1.0, etc.  I
 have tried manipulating the options, but it is only effective once I
 multiply by 1..

 I appreciate any suggestions!

 Thanks!

 Example:

 z - sqldf (select ST,
 SUM(AGEP*PWGTP)*1.0/SUM(PWGTP)*1.00 as wgtage from
 ss09 group by ST)

 z shouldn't be whole numbers.


        [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help

Re: [R] Decimals in R/SQL

2011-04-13 Thread Gabor Grothendieck
On Wed, Apr 13, 2011 at 4:34 PM, Rachel Licata rach...@kff.org wrote:
 Thanks again Jim - that is really helpful and I apologize that I am new to R. 
  How can I convert to numeric in SQL and when I am working on a table in a 
 database?  The file is huge so that is why I am using SQL and the database to 
 work through it.
 Thanks!


There are some examples here (a few of them, as noted, require the
development version of sqldf):

http://code.google.com/p/sqldf/#15._Why_do_certain_calculations_come_out_as_integer_rather_than

-- 
Statistics  Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Decimals in R/SQL

2011-04-13 Thread Seth Falcon
On Wed, Apr 13, 2011 at 1:34 PM, Rachel Licata rach...@kff.org wrote:

 Thanks again Jim - that is really helpful and I apologize that I am
 new to R.  How can I convert to numeric in SQL and when I am working
 on a table in a database?  The file is huge so that is why I am
 using SQL and the database to work through it.

I believe that RSQLite will do the right thing if you provide the
correct types in your schema.  So for a new database, you want to make
sure that the columns that you want to be numeric are created like:

CREATE table sometable (my_data REAL);

You should be able to create a new table from an existing table using
SQL where the new table has the types you want.

+ seth


-- 
Seth Falcon | @sfalcon | http://userprimary.net/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] SQL PL/SQL

2011-01-10 Thread Ramesh
Your email client cannot read this email.
To view it online, please go here:
http://email.careermailer.com:2020/display.php?M=161372C=63b37a209d9dced6e8262a1862c616e4S=552L=5N=559


To stop receiving these
emails:http://email.careermailer.com:2020/unsubscribe.php?M=161372C=63b37a209d9dced6e8262a1862c616e4L=5N=552

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL/R

2010-07-26 Thread Bart Joosen

see the manual from package sqldf:
http://cran.r-project.org/web/packages/sqldf/sqldf.pdf

Bart
-- 
View this message in context: 
http://r.789695.n4.nabble.com/SQL-R-tp2298545p2301975.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL/R

2010-07-23 Thread Gildas Mazo
Thanks for your answers,

Best,

Gildas




Brian Diggs a écrit :
 On 7/22/2010 5:01 AM, Allan Engelhardt wrote:
 There are so many ways  Here is one:

 aggregate(v ~ u, data=X, function(...) length(unique(...)))
 # u v
 # 1 T1 2
 # 2 T2 1

 Hope this helps

 Here is one other way, using the plyr package (which is very good for
 taking a data structure (data.frame, list, array), pulling it apart by
 some criteria, doing something on each of the parts, and putting the
 results back together):

 library(plyr)
 ddply(X, .(u), function(x) {length(unique(x$v))})
 #   u V1
 #1 T1  2
 #2 T2  1


 Allan.

 On 22/07/10 12:52, Gildas Mazo wrote:
 Dear R users,

 I want to aggregate data in the following way:

 ###

 X- data.frame(u = c(T1,T1,T1,T2), v=c(a,a,b,a))
 X
 library(sqldf)
 sqlOut- sqldf(select count(distinct(v)) from X group by u)
 sqlOut

 ###

 Now I want to get the same result without using SQL. How can I achieve
 that ?

 Thanks for your help,

 Gildas

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.





__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] SQL/R

2010-07-22 Thread Gildas Mazo
Dear R users,

I want to aggregate data in the following way:

###

X - data.frame(u = c(T1,T1,T1,T2), v=c(a,a,b,a))
X
library(sqldf)
sqlOut - sqldf(select count(distinct(v)) from X group by u)
sqlOut

###

Now I want to get the same result without using SQL. How can I achieve
that ?

Thanks for your help,

Gildas

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL/R

2010-07-22 Thread Allan Engelhardt

There are so many ways  Here is one:

aggregate(v ~ u, data=X, function(...) length(unique(...)))
#u v
# 1 T1 2
# 2 T2 1

Hope this helps

Allan.

On 22/07/10 12:52, Gildas Mazo wrote:

Dear R users,

I want to aggregate data in the following way:

###

X- data.frame(u = c(T1,T1,T1,T2), v=c(a,a,b,a))
X
library(sqldf)
sqlOut- sqldf(select count(distinct(v)) from X group by u)
sqlOut

###

Now I want to get the same result without using SQL. How can I achieve
that ?

Thanks for your help,

Gildas

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL/R

2010-07-22 Thread Brian Diggs

On 7/22/2010 5:01 AM, Allan Engelhardt wrote:

There are so many ways  Here is one:

aggregate(v ~ u, data=X, function(...) length(unique(...)))
# u v
# 1 T1 2
# 2 T2 1

Hope this helps


Here is one other way, using the plyr package (which is very good for 
taking a data structure (data.frame, list, array), pulling it apart by 
some criteria, doing something on each of the parts, and putting the 
results back together):


library(plyr)
ddply(X, .(u), function(x) {length(unique(x$v))})
#   u V1
#1 T1  2
#2 T2  1



Allan.

On 22/07/10 12:52, Gildas Mazo wrote:

Dear R users,

I want to aggregate data in the following way:

###

X- data.frame(u = c(T1,T1,T1,T2), v=c(a,a,b,a))
X
library(sqldf)
sqlOut- sqldf(select count(distinct(v)) from X group by u)
sqlOut

###

Now I want to get the same result without using SQL. How can I achieve
that ?

Thanks for your help,

Gildas

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.






--
--
Brian Diggs
Senior Research Associate, Department of Surgery, Oregon Health  
Science University


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] SQL-select using native R methods ?

2010-03-22 Thread Jan . Sunde
Hi
I have a problem in R that I have been trying to solve but without 
success.
I am trying to join two tables on two variables : an ID and a date 
(optional) that will be common between the two tables

In SQL (and SAS PROC SQL) I am a frequent user of the select command and 
I am used to the following nomenclature : 

select a.*, b.c, b.y, b.z from table1 a, table2 b where a.date=b.date and 
a.id=b.id

I tried this in R (using sqldf) but it takes wy too long to get the 
result. My data sets are 1 gb each.
Is there any way this can be done by merge() or any other more R-like 
way ? 
I have tried making a compound variable with paste(id, as.character(date), 
sep=) and merge() on that but since the date more often than not is NA
this doesnt seem to work. I'm a bit stuck and the documentation is of 
little help to me in this case. 
If anybody would be kind to point me in the right direction I would be 
grateful.

Jan


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL-select using native R methods ?

2010-03-22 Thread Gabor Grothendieck
On Mon, Mar 22, 2010 at 6:37 AM,  jan.su...@biokapital.no wrote:
 Hi
 I have a problem in R that I have been trying to solve but without
 success.
 I am trying to join two tables on two variables : an ID and a date
 (optional) that will be common between the two tables

 In SQL (and SAS PROC SQL) I am a frequent user of the select command and
 I am used to the following nomenclature :

 select a.*, b.c, b.y, b.z from table1 a, table2 b where a.date=b.date and
 a.id=b.id

 I tried this in R (using sqldf) but it takes wy too long to get the
 result. My data sets are 1 gb each.

Try adding indexes to the join columns.  I was able to do a join
between two 1 million row tables in 13 seconds (under a minute
including the time to add the indexes) on a laptop (not a particularly
fast machine).

 set.seed(1)
 n - 100
 DF1 - data.frame(a = sample(n, n), b1 = runif(n))
 DF2 - data.frame(a = sample(n, n), b2 = runif(n))
 library(sqldf)

 sqldf()
SQLiteConnection:(2708,1)
 system.time(sqldf(create index ai1 on DF1(a)))
   user  system elapsed
  16.630.09   16.85
 system.time(sqldf(create index ai2 on DF2(a)))
   user  system elapsed
  16.060.04   16.30
 system.time(sqldf(select * from main.DF1 natural join main.DF2))
   user  system elapsed
  12.720.21   12.93
 sqldf()


 Is there any way this can be done by merge() or any other more R-like
 way ?
 I have tried making a compound variable with paste(id, as.character(date),
 sep=) and merge() on that but since the date more often than not is NA
 this doesnt seem to work. I'm a bit stuck and the documentation is of
 little help to me in this case.
 If anybody would be kind to point me in the right direction I would be
 grateful.

 Jan


        [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] sql query variable

2010-02-18 Thread RagingJim

Thanks guys. I ended up doing as you suggested Dieter. Thanks for the idea :)
-- 
View this message in context: 
http://n4.nabble.com/sql-query-variable-tp1558189p1561158.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] sql query variable

2010-02-17 Thread Bart Joosen

Another way:

cat(Station number?)
flush.console()
sn - scan(quiet=TRUE,n=1)
sqlQuery(conn, paste(select to_char(lsd,'-mm') as yr,ttl_mo_prcp from
mo_rains where stn_num=,sn,sep=))

Bart

-- 
View this message in context: 
http://n4.nabble.com/sql-query-variable-tp1558189p1558504.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] sql query variable

2010-02-16 Thread RagingJim

This is the very last thing I need to make everything work properly. My
query:

sqlQuery(conn, select to_char(lsd,'-mm') as yr,ttl_mo_prcp from
mo_rains where stn_num=023000)

Is there a way to may the stn_num in the query variable, ie make it so that
whenever my script is run, the user must choose and input the station
number?

Cheers.
-- 
View this message in context: 
http://n4.nabble.com/sql-query-variable-tp1558189p1558189.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] sql query variable

2010-02-16 Thread Dieter Menne


RagingJim wrote:
 
 This is the very last thing I need to make everything work properly. My
 query:
 sqlQuery(conn, select to_char(lsd,'-mm') as yr,ttl_mo_prcp from
 mo_rains where stn_num=023000)
 
 Is there a way to may the stn_num in the query variable, ie make it so
 that whenever my script is run, the user must choose and input the station
 number?
 


You could use tclTk to input the stn_num, and generate the query string with
paste()

Dieter

-- 
View this message in context: 
http://n4.nabble.com/sql-query-variable-tp1558189p1558308.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] SQL on R

2009-07-05 Thread JoK LoQ

I'm dealing with lots of columns and conditions, wats t best way to deal with
that?
How do I work with SQL on R? the manual is quite confuse talking about that.
Do I need a package?
-- 
View this message in context: 
http://www.nabble.com/SQL-on-R-tp24340314p24340314.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL on R

2009-07-05 Thread Bernardo Rangel Tura
On Sat, 2009-07-04 at 22:24 -0700, JoK LoQ wrote:
 I'm dealing with lots of columns and conditions, wats t best way to deal with
 that?
 How do I work with SQL on R? the manual is quite confuse talking about that.
 Do I need a package?

I don't understand your question, but 

if you think use SQL in your data frames do you use SQLDF
(http://cran.r-project.org/web/packages/sqldf/index.html) 

if you thinks use a database server and access it in R i sugest you use
RMySQL (http://cran.r-project.org/web/packages/RMySQL/index.html)
-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL on R

2009-07-05 Thread JoK LoQ

Fala ai cara!

 Era o RMySQL mesmo q eu precisava, vlw


Bernardo Rangel tura wrote:
 
 On Sat, 2009-07-04 at 22:24 -0700, JoK LoQ wrote:
 I'm dealing with lots of columns and conditions, wats t best way to deal
 with
 that?
 How do I work with SQL on R? the manual is quite confuse talking about
 that.
 Do I need a package?
 
 I don't understand your question, but 
 
 if you think use SQL in your data frames do you use SQLDF
 (http://cran.r-project.org/web/packages/sqldf/index.html) 
 
 if you thinks use a database server and access it in R i sugest you use
 RMySQL (http://cran.r-project.org/web/packages/RMySQL/index.html)
 -- 
 Bernardo Rangel Tura, M.D,MPH,Ph.D
 National Institute of Cardiology
 Brazil
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 

-- 
View this message in context: 
http://www.nabble.com/SQL-on-R-tp24340314p24346040.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL Queries from Multiple Servers

2009-05-15 Thread Mark Wardle
Hi.

Depending on your requirements, one option would be to do the join in
R using merge()

If you wish to run SQL joins across multiple databases, then it is not
an R problem but a database problem. For a quick solution, I would
write scripts that bring all your data together into one database
(could be written in any scripting language, and of course R) and then
process from there.

Bw

Mark

2009/5/13 Tom Schenk Jr tomschen...@gmail.com:
 I use RODBC as my conduit from R to SQL. It works well when the tables are
 stored on one channel, e.g.,

 channel - odbcConnect(data_base_01, uid=, dsn=)

 However, I often need to match tables across multiple databases, e.g.,
 data_base_01 and data_base_02. However, odbcConnect() appears limited
 insofar as you may only query from tables within a single channel, e.g.,
 database. I do not have access to write and create new tables on the SQL
 servers, which is a possible solution (e.g., copy all tables into a single
 database).

 Is there any way, in RODBC or another R-friendly SQL package, to perform SQL
 operations across multiple databases?

 Warm regards.

 --
 Tom Schenk Jr.
 tomschen...@gmail.com

        [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.





-- 
Dr. Mark Wardle
Specialist registrar, Neurology
Cardiff, UK

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL Queries from Multiple Servers

2009-05-14 Thread Paul Gilbert

Tom Schenk Jr wrote:

I use RODBC as my conduit from R to SQL. It works well when the tables are
stored on one channel, e.g.,

channel - odbcConnect(data_base_01, uid=, dsn=)

However, I often need to match tables across multiple databases, e.g.,
data_base_01 and data_base_02. However, odbcConnect() appears limited
insofar as you may only query from tables within a single channel, e.g.,
database. I do not have access to write and create new tables on the SQL
servers, which is a possible solution (e.g., copy all tables into a single
database).

Is there any way, in RODBC or another R-friendly SQL package, to perform SQL
operations across multiple databases?
  
I'm not sure if this can be done with odbc, but with MySQL it is 
possible to do joins across multiple databases, and creating temporary 
tables may be possible even without the write access you would need for 
a permanent table. I'm not sure if you can pass this kind of statement 
from R, because the connection usually specifies the database. However, 
I have constructed temporary tables with a simple mysql client and then 
queried them from R. They stay around as long as you don't quit the 
simple client.  I am not really sure this is suppose to work.  Another 
option is two connections and do some of the comparison in R, or write 
the results to an SQLite connection, on which you usually have write 
access. This might be slow  and you may have to deal with chunks if you 
have big tables.  Joins across databases are also possible with 
PostgreSQL, I'm told, but they are more difficult.


Paul

Warm regards.

  



La version française suit le texte anglais.



This email may contain privileged and/or confidential in...{{dropped:26}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] SQL Queries from Multiple Servers

2009-05-13 Thread Tom Schenk Jr
I use RODBC as my conduit from R to SQL. It works well when the tables are
stored on one channel, e.g.,

channel - odbcConnect(data_base_01, uid=, dsn=)

However, I often need to match tables across multiple databases, e.g.,
data_base_01 and data_base_02. However, odbcConnect() appears limited
insofar as you may only query from tables within a single channel, e.g.,
database. I do not have access to write and create new tables on the SQL
servers, which is a possible solution (e.g., copy all tables into a single
database).

Is there any way, in RODBC or another R-friendly SQL package, to perform SQL
operations across multiple databases?

Warm regards.

-- 
Tom Schenk Jr.
tomschen...@gmail.com

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL Queries from Multiple Servers

2009-05-13 Thread Jeff Newmiller

Tom Schenk Jr wrote:

I use RODBC as my conduit from R to SQL. It works well when the tables are
stored on one channel, e.g.,

channel - odbcConnect(data_base_01, uid=, dsn=)

However, I often need to match tables across multiple databases, e.g.,
data_base_01 and data_base_02. However, odbcConnect() appears limited
insofar as you may only query from tables within a single channel, e.g.,
database. I do not have access to write and create new tables on the SQL
servers, which is a possible solution (e.g., copy all tables into a single
database).

Is there any way, in RODBC or another R-friendly SQL package, to perform SQL
operations across multiple databases?


Sounds like you want... a SQL database engine.  Some engines provide
features to work with external tables... but it is not standardized and
it is not usually very efficient. You usually have to have some level of
admin privilege to do this.

Alternatively, you can create a new database using an engine of your
choice, move the data to that common working database, and do your
cross-queries to your heart's content. Depending on your working
environment, SQLite, MySQL, MS Jet, or MSDE could be options. Or, if your
joins are fairly small you can use R's merge function and doBy.
It seems a bit much to me to expect a database access library to
do double duty as a SQL database engine, though.

--
---
Jeff NewmillerThe .   .  Go Live...
DCN:jdnew...@dcn.davis.ca.usBasics: ##.#.   ##.#.  Live Go...
  Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
/Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL Primer for R

2008-09-01 Thread Steve Revilak

Date: Sun, 31 Aug 2008 21:29:38 -0400
From: ivo welch
Subject: Re: [R] SQL Primer for R



stumped again by SQL...  If I have a table named main in an SQLite
data base, how do I get the names of all its columns?  (I have a mysql
book that claims the SHOW command does this sort of thing, but it does
not seem to work on SQLite.)


It sounds like SQLite's .schema command might be you're looking for.
Here's an example:

  $ sqlite3 foo.db
  SQLite version 3.5.4
  Enter .help for instructions
  sqlite create table T (c1 integer, c2 integer, c3 integer);
  sqlite .tables
  T
  sqlite .schema T
  CREATE TABLE T (c1 integer, c2 integer, c3 integer);
  sqlite .quit

Steve Revilak

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL Primer for R

2008-09-01 Thread ivo welch
wow!  the answer seems to be pragma table_info(main);  thanks, Gabor.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL Primer for R

2008-09-01 Thread Duncan Murdoch

ivo welch wrote:

stumped again by SQL...  If I have a table named main in an SQLite
data base, how do I get the names of all its columns?  (I have a mysql
book that claims the SHOW command does this sort of thing, but it does
not seem to work on SQLite.)
  
If you are using RSQLite, then a lot of the obscure commands are 
available as R functions.  In this case I think you want sqliteTableFields.


Duncan Murdoch

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL Primer for R

2008-08-31 Thread ivo welch
stumped again by SQL...  If I have a table named main in an SQLite
data base, how do I get the names of all its columns?  (I have a mysql
book that claims the SHOW command does this sort of thing, but it does
not seem to work on SQLite.)

regards,  /iaw

PS: Thanks for the earlier emails on warn=2.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL Primer for R

2008-08-31 Thread Gabor Grothendieck
On Sun, Aug 31, 2008 at 9:29 PM, ivo welch [EMAIL PROTECTED] wrote:
 stumped again by SQL...  If I have a table named main in an SQLite
 data base, how do I get the names of all its columns?  (I have a mysql
 book that claims the SHOW command does this sort of thing, but it does
 not seem to work on SQLite.)

look up the pragma command.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL Primer for R

2008-08-26 Thread ivo welch
Sorry, chaps.  I need one more:

 dbDisconnect(con.in)
Error in sqliteCloseConnection(conn, ...) :
  RS-DBI driver: (close the pending result sets before closing this connection)


I am pretty sure I have fetched everything there is to be fetched.  I
am not sure what I need to do to say goodbye (or to find out what is
still pending).  ?dbDisconnect doesn't tell me.  PS: the documentation
for dbConnect should probably add dbDisconnect to its 'See also'
section.

regards,

/iaw

Really irrelevant PS: the by function could keep the number of
observations that go into each category.  I know it can be computed
separately, which is what I am doing now.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL Primer for R

2008-08-26 Thread Thomas Lumley

On Tue, 26 Aug 2008, ivo welch wrote:


Sorry, chaps.  I need one more:


dbDisconnect(con.in)

Error in sqliteCloseConnection(conn, ...) :
 RS-DBI driver: (close the pending result sets before closing this connection)




I am pretty sure I have fetched everything there is to be fetched.


dbClearResult

-thomas

Thomas Lumley   Assoc. Professor, Biostatistics
[EMAIL PROTECTED]   University of Washington, Seattle

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] SQL Primer for R

2008-08-25 Thread ivo welch
Dear R wizards:

I decided to take the advice in the R data import/export manual and
want to learn how to work with SQL for large data sets.  I am trying
SQLite with the DBI and RSQLite database interfaces.  Speed is nice.
Alas, I am struggling to find a tutorial that is geared for the kind
of standard operations that I would want in R.  Simple things:

*  how to determine the number of rows in a table.  (Of course, I
could select a row of data and then use this.)

*  how to insert a new column into my existing SQL table---say, the
rank of another variable---and save it back.  Am I supposed to create
a new data frame, then save it as a new table, then delete the old SQL
table?

*  how to save a revised version of my table in a different sort order
 (with or without deleting the original table).  -- I guess this is
not appropriate, as I should think of SQL tables as unordered.

I guess these would make nice little text snippets in the R Data
import/export manual, too.  help appreciated.

regards,

/ivo

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL Primer for R

2008-08-25 Thread Ajay ohri
R 's ability to work with large data sets is limited on PC. I am
trying it on cloud to overcome this but need help from other
programmers.

Also tutorials would help to newbies

regards,

Ajay

On Mon, Aug 25, 2008 at 6:42 PM, ivo welch [EMAIL PROTECTED] wrote:
 Dear R wizards:

 I decided to take the advice in the R data import/export manual and
 want to learn how to work with SQL for large data sets.  I am trying
 SQLite with the DBI and RSQLite database interfaces.  Speed is nice.
 Alas, I am struggling to find a tutorial that is geared for the kind
 of standard operations that I would want in R.  Simple things:

 *  how to determine the number of rows in a table.  (Of course, I
 could select a row of data and then use this.)

 *  how to insert a new column into my existing SQL table---say, the
 rank of another variable---and save it back.  Am I supposed to create
 a new data frame, then save it as a new table, then delete the old SQL
 table?

 *  how to save a revised version of my table in a different sort order
  (with or without deleting the original table).  -- I guess this is
 not appropriate, as I should think of SQL tables as unordered.

 I guess these would make nice little text snippets in the R Data
 import/export manual, too.  help appreciated.

 regards,

 /ivo

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Regards,

Ajay Ohri
http://tinyurl.com/liajayohri

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL Primer for R

2008-08-25 Thread Gabor Grothendieck
The sqldf package's home page

http://sqldf.googlecode.com

has quite a few examples as well as links to resources.  Also see the
examples in ?sqldf

On Mon, Aug 25, 2008 at 9:12 AM, ivo welch [EMAIL PROTECTED] wrote:
 Dear R wizards:

 I decided to take the advice in the R data import/export manual and
 want to learn how to work with SQL for large data sets.  I am trying
 SQLite with the DBI and RSQLite database interfaces.  Speed is nice.
 Alas, I am struggling to find a tutorial that is geared for the kind
 of standard operations that I would want in R.  Simple things:

 *  how to determine the number of rows in a table.  (Of course, I
 could select a row of data and then use this.)

 *  how to insert a new column into my existing SQL table---say, the
 rank of another variable---and save it back.  Am I supposed to create
 a new data frame, then save it as a new table, then delete the old SQL
 table?

 *  how to save a revised version of my table in a different sort order
  (with or without deleting the original table).  -- I guess this is
 not appropriate, as I should think of SQL tables as unordered.

 I guess these would make nice little text snippets in the R Data
 import/export manual, too.  help appreciated.

 regards,

 /ivo

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL Primer for R

2008-08-25 Thread Gabor Grothendieck
This would be a function of the database software more than R.  For sqlite
a few dozen gigabytes should be ok.  Other databases can handle even
larger requirements.

On Mon, Aug 25, 2008 at 9:17 AM, Ajay ohri [EMAIL PROTECTED] wrote:
 R 's ability to work with large data sets is limited on PC. I am
 trying it on cloud to overcome this but need help from other
 programmers.

 Also tutorials would help to newbies

 regards,

 Ajay

 On Mon, Aug 25, 2008 at 6:42 PM, ivo welch [EMAIL PROTECTED] wrote:
 Dear R wizards:

 I decided to take the advice in the R data import/export manual and
 want to learn how to work with SQL for large data sets.  I am trying
 SQLite with the DBI and RSQLite database interfaces.  Speed is nice.
 Alas, I am struggling to find a tutorial that is geared for the kind
 of standard operations that I would want in R.  Simple things:

 *  how to determine the number of rows in a table.  (Of course, I
 could select a row of data and then use this.)

 *  how to insert a new column into my existing SQL table---say, the
 rank of another variable---and save it back.  Am I supposed to create
 a new data frame, then save it as a new table, then delete the old SQL
 table?

 *  how to save a revised version of my table in a different sort order
  (with or without deleting the original table).  -- I guess this is
 not appropriate, as I should think of SQL tables as unordered.

 I guess these would make nice little text snippets in the R Data
 import/export manual, too.  help appreciated.

 regards,

 /ivo

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




 --
 Regards,

 Ajay Ohri
 http://tinyurl.com/liajayohri

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL Primer for R

2008-08-25 Thread Thomas Lumley

On Mon, 25 Aug 2008, ivo welch wrote:


Dear R wizards:

I decided to take the advice in the R data import/export manual and
want to learn how to work with SQL for large data sets.  I am trying
SQLite with the DBI and RSQLite database interfaces.  Speed is nice.
Alas, I am struggling to find a tutorial that is geared for the kind
of standard operations that I would want in R.  Simple things:

*  how to determine the number of rows in a table.  (Of course, I
could select a row of data and then use this.)


To get number of rows
  SELECT COUNT(*) FROM table_name
For number of columns the quickest reasonably portable way I know is
  SELECT * FROM table_name LIMIT 1
and then count the columns of the result.


*  how to insert a new column into my existing SQL table---say, the
rank of another variable---and save it back.  Am I supposed to create
a new data frame, then save it as a new table, then delete the old SQL
table?


  ALTER TABLE table_name ADD column_name
(you may have to/want to specify a data type as well)

If you are adding a bunch of columns you might also want to put the new 
columns in a separate table and JOIN the tables, especially if the need 
for these extra columns is sporadic or temporary.



*  how to save a revised version of my table in a different sort order
(with or without deleting the original table).  -- I guess this is
not appropriate, as I should think of SQL tables as unordered.


As you note, SQL tables are conceptually unordered. You can order results 
of a query as you read them:

  SELECT foo, bar FROM table_name ORDER BY baz


-thomas

Thomas Lumley   Assoc. Professor, Biostatistics
[EMAIL PROTECTED]   University of Washington, Seattle

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


  1   2   >