Re: [R] Import of Access data via RODBC changes column name ("NO" to "Expr1014") and the content of the column

2007-08-15 Thread Maciej Hoffman-Wecker

Thank you very much Professor Ripley! Afterwards it seems obvious where to 
look. 

Have a nice day,
Maciej

PS: Yes, my machine has not much memory, but it is sufficient for the smaller 
trial data. 

-Ursprüngliche Nachricht-
Von: Prof Brian Ripley [mailto:[EMAIL PROTECTED] 
Gesendet: Dienstag, 14. August 2007 19:14
An: Maciej Hoffman-Wecker
Cc: r-help@stat.math.ethz.ch
Betreff: Re: AW: [R] Import of Access data via RODBC changes column name ("NO" 
to "Expr1014") and the content of the column

On Tue, 14 Aug 2007, Maciej Hoffman-Wecker wrote:

> Dear Professor Ripley,
>
> Thank you very much for your response. I send the problem, as I didn't 
> have any more ideas were to search for the reason. I didn't say this 
> is a R bug, knowing the responses on such mails.-)
>
> But I succeeded in developing a tiny example, that reproduces the bug 
> (wherever it is).

Thank you, that was helpful: much easier to follow that the previous code.

...

>> library(RODBC)
>> .con <- odbcConnectAccess("./test2.mdb") (.d <- try(sqlQuery(.con, 
>> "select * from Tab1")))
>  F1 NO F2
> 1  1  1  1
> 2  2  2  2
> 3  0 NA  1
> 4  1  0  0
>> (.d <- try(sqlQuery(.con, "select F1 , NO , F2 from Tab1")))
>  F1 Expr1001 F2
> 1  10  1
> 2  20  2
> 3  00  1
> 4  10  0
>> close(.con)
>
> So the problem occurs if the column names are specified within the query.
> Is the query "select F1 , NO , F2 from Tab1" invalid?

I believe so. 'NO' is an SQL92 and ODBC reserved word, at least according to 
http://www.bairdgroup.com/reservedwords.cfm

See also http://support.microsoft.com/default.aspx?scid=kb;en-us;286335
which says

   For existing objects with names that contain reserved words, you can
   avoid errors by surrounding the object name with brackets ([ ]).

and lists 'NO' as a reserved word.  RODBC quotes all column names it uses to be 
sure (and knows about most non-standard quoting mechanisms from the ODBC driver 
in use).  But this was a query you generated and so you need to do the quoting.

> Regarding the memory issue, I _knew_ that there must be a reason for 
> the running out of memory space. Sorry for not being more specific. My 
> question than is:
>
> Is there a way to 'reset' the environment without quitting R and 
> restarting it?

Sorry, no.  You cannot move objects in memory.

But why '477Mb' is coming up is still unexplained, and suggests that the 
machine has a peculiar amount of memory or some flag has been used.


>
> Thank you for your help.
>
> Kind regards,
> Maciej
>
>
> -Ursprüngliche Nachricht-----
> Von: Prof Brian Ripley [mailto:[EMAIL PROTECTED]
> Gesendet: Dienstag, 14. August 2007 11:51
> An: Maciej Hoffman-Wecker
> Cc: r-help@stat.math.ethz.ch
> Betreff: Re: [R] Import of Access data via RODBC changes column name ("NO" to 
> "Expr1014") and the content of the column
>
> On Tue, 14 Aug 2007, Maciej Hoffman-Wecker wrote:
>
>>
>> Dear all,
>>
>> I have some problems with importing data from an Access data base via
>> RODBC to R. The data base contains several tables, which all are
>> imported consecutively. One table has a column with column name "NO".
>> If I run the code attached on the bottom of the mail I get no
>> complain, but the column name (name of the respective vector of the
>> data.frame) is "Expr1014" instead of "NO". Additionally the original
>> column (type
>> "text") containes "0"s and missings, but the imported column contains
>> "0"s only (type "int"). If I change the column name in the Access data
>> base to "NOx", the import works fine with the right name and the same
>> data.
>>
>> Previously I generated a tiny Access data base which reproduced the
>> problem. To be on the safe site I installed the latest version (2.5.1)
>> and now the example works fine, but within my production process the
>> error still remaines. An import into excel via ODBC works fine.
>>
>> So there is no way to figure it out whether this is a bug or a
>> feature.-)
>
> It's most likely an ODBC issue, but you have not provided a reproducible 
> example.
>
>> The second problem I have is that when I rerun "rm(list = ls(all =
>> T)); gc()" and the import several times I get the following error:
>>
>> Error in odbcTables(channel) : Calloc could not allocate (263168 of 1)
>> memory In addition: Warning messages:
>> 1: Reached total allocation of 447Mb: see help(memory.size) in:
>> odbcQuery(channel, q

Re: [R] Import of Access data via RODBC changes column name ("NO" to "Expr1014") and the content of the column

2007-08-14 Thread Prof Brian Ripley

On Tue, 14 Aug 2007, Maciej Hoffman-Wecker wrote:


Dear Professor Ripley,

Thank you very much for your response. I send the problem, as I didn't 
have any more ideas were to search for the reason. I didn't say this is 
a R bug, knowing the responses on such mails.-)


But I succeeded in developing a tiny example, that reproduces the bug 
(wherever it is).


Thank you, that was helpful: much easier to follow that the previous code.

...


library(RODBC)
.con <- odbcConnectAccess("./test2.mdb")
(.d <- try(sqlQuery(.con, "select * from Tab1")))

 F1 NO F2
1  1  1  1
2  2  2  2
3  0 NA  1
4  1  0  0

(.d <- try(sqlQuery(.con, "select F1 , NO , F2 from Tab1")))

 F1 Expr1001 F2
1  10  1
2  20  2
3  00  1
4  10  0

close(.con)


So the problem occurs if the column names are specified within the query.
Is the query "select F1 , NO , F2 from Tab1" invalid?


I believe so. 'NO' is an SQL92 and ODBC reserved word, at least according 
to http://www.bairdgroup.com/reservedwords.cfm


See also http://support.microsoft.com/default.aspx?scid=kb;en-us;286335
which says

  For existing objects with names that contain reserved words, you can
  avoid errors by surrounding the object name with brackets ([ ]).

and lists 'NO' as a reserved word.  RODBC quotes all column names it uses 
to be sure (and knows about most non-standard quoting mechanisms from the 
ODBC driver in use).  But this was a query you generated and so you need 
to do the quoting.


Regarding the memory issue, I _knew_ that there must be a reason for the 
running out of memory space. Sorry for not being more specific. My 
question than is:


Is there a way to 'reset' the environment without quitting R and 
restarting it?


Sorry, no.  You cannot move objects in memory.

But why '477Mb' is coming up is still unexplained, and suggests that the 
machine has a peculiar amount of memory or some flag has been used.





Thank you for your help.

Kind regards,
Maciej


-Ursprüngliche Nachricht-
Von: Prof Brian Ripley [mailto:[EMAIL PROTECTED]
Gesendet: Dienstag, 14. August 2007 11:51
An: Maciej Hoffman-Wecker
Cc: r-help@stat.math.ethz.ch
Betreff: Re: [R] Import of Access data via RODBC changes column name ("NO" to 
"Expr1014") and the content of the column

On Tue, 14 Aug 2007, Maciej Hoffman-Wecker wrote:



Dear all,

I have some problems with importing data from an Access data base via
RODBC to R. The data base contains several tables, which all are
imported consecutively. One table has a column with column name "NO".
If I run the code attached on the bottom of the mail I get no
complain, but the column name (name of the respective vector of the
data.frame) is "Expr1014" instead of "NO". Additionally the original
column (type
"text") containes "0"s and missings, but the imported column contains
"0"s only (type "int"). If I change the column name in the Access data
base to "NOx", the import works fine with the right name and the same
data.

Previously I generated a tiny Access data base which reproduced the
problem. To be on the safe site I installed the latest version (2.5.1)
and now the example works fine, but within my production process the
error still remaines. An import into excel via ODBC works fine.

So there is no way to figure it out whether this is a bug or a
feature.-)


It's most likely an ODBC issue, but you have not provided a reproducible 
example.


The second problem I have is that when I rerun "rm(list = ls(all =
T)); gc()" and the import several times I get the following error:

Error in odbcTables(channel) : Calloc could not allocate (263168 of 1)
memory In addition: Warning messages:
1: Reached total allocation of 447Mb: see help(memory.size) in:
odbcQuery(channel, query, rows_at_time)
2: Reached total allocation of 447Mb: see help(memory.size) in:
odbcQuery(channel, query, rows_at_time)
3: Reached total allocation of 447Mb: see help(memory.size) in:
odbcTables(channel)
4: Reached total allocation of 447Mb: see help(memory.size) in:
odbcTables(channel)

which is surprising to me, as the first two statements should delete
all


How do you _know _what they 'should' do?  That only deletes all objects in the 
workspace, not all objects in R, and not all memory blocks used by R.

Please do read ?"Memory-limits" for the possible reasons.

Where did '447Mb' come from?  If this machine has less than 2Gb of RAM, buy 
some more.



objects and recover the memory. Is this only a matter of memory? Is
there any logging that reduces the memory? Or is this issue connected to
the upper problem?

I added the code on the bottom - maybe there is some kind of misuse I
lost sight of. Any hints are appreciated.

Kind regards,
Maciej


version

  _
platform   i386-

Re: [R] Import of Access data via RODBC changes column name ("NO" to "Expr1014") and the content of the column

2007-08-14 Thread Maciej Hoffman-Wecker
Dear Professor Ripley,

Thank you very much for your response. I send the problem, as I didn't have any 
more ideas were to search for the reason. I didn't say this is a R bug, knowing 
the responses on such mails.-) 

But I succeeded in developing a tiny example, that reproduces the bug (wherever 
it is).

I generated a small Access data base "test2.mdb" with one table "Tab1" and 
following columns:

"Field name""Field type"
  F1Number
  NONumber
  F2Number

(sorry if the Access identifiers are not the standard ones, as I have a german 
Access version)

The content of the "Tab1" table is:

F1  NO  F2
1   1   1
2   2   2
0   1
1   0   0

(The column "NO" contains one missing)

Now if I import the data into R, I get the following results:

> library(RODBC)
> .con <- odbcConnectAccess("./test2.mdb")
> (.d <- try(sqlQuery(.con, "select * from Tab1")))
  F1 NO F2
1  1  1  1
2  2  2  2
3  0 NA  1
4  1  0  0
> (.d <- try(sqlQuery(.con, "select F1 , NO , F2 from Tab1")))
  F1 Expr1001 F2
1  10  1
2  20  2
3  00  1
4  10  0
> close(.con)

So the problem occurs if the column names are specified within the query.
Is the query "select F1 , NO , F2 from Tab1" invalid?

Regarding the memory issue, I _knew_ that there must be a reason for the 
running out of memory space. Sorry for not being more specific. My question 
than is:

Is there a way to 'reset' the environment without quitting R and restarting it?

Thank you for your help. 

Kind regards,
Maciej


-Ursprüngliche Nachricht-
Von: Prof Brian Ripley [mailto:[EMAIL PROTECTED] 
Gesendet: Dienstag, 14. August 2007 11:51
An: Maciej Hoffman-Wecker
Cc: r-help@stat.math.ethz.ch
Betreff: Re: [R] Import of Access data via RODBC changes column name ("NO" to 
"Expr1014") and the content of the column

On Tue, 14 Aug 2007, Maciej Hoffman-Wecker wrote:

>
> Dear all,
>
> I have some problems with importing data from an Access data base via 
> RODBC to R. The data base contains several tables, which all are 
> imported consecutively. One table has a column with column name "NO". 
> If I run the code attached on the bottom of the mail I get no 
> complain, but the column name (name of the respective vector of the 
> data.frame) is "Expr1014" instead of "NO". Additionally the original 
> column (type
> "text") containes "0"s and missings, but the imported column contains 
> "0"s only (type "int"). If I change the column name in the Access data 
> base to "NOx", the import works fine with the right name and the same 
> data.
>
> Previously I generated a tiny Access data base which reproduced the 
> problem. To be on the safe site I installed the latest version (2.5.1) 
> and now the example works fine, but within my production process the 
> error still remaines. An import into excel via ODBC works fine.
>
> So there is no way to figure it out whether this is a bug or a
> feature.-)

It's most likely an ODBC issue, but you have not provided a reproducible 
example.

> The second problem I have is that when I rerun "rm(list = ls(all = 
> T)); gc()" and the import several times I get the following error:
>
> Error in odbcTables(channel) : Calloc could not allocate (263168 of 1) 
> memory In addition: Warning messages:
> 1: Reached total allocation of 447Mb: see help(memory.size) in:
> odbcQuery(channel, query, rows_at_time)
> 2: Reached total allocation of 447Mb: see help(memory.size) in:
> odbcQuery(channel, query, rows_at_time)
> 3: Reached total allocation of 447Mb: see help(memory.size) in:
> odbcTables(channel)
> 4: Reached total allocation of 447Mb: see help(memory.size) in:
> odbcTables(channel)
>
> which is surprising to me, as the first two statements should delete 
> all

How do you _know _what they 'should' do?  That only deletes all objects in the 
workspace, not all objects in R, and not all memory blocks used by R.

Please do read ?"Memory-limits" for the possible reasons.

Where did '447Mb' come from?  If this machine has less than 2Gb of RAM, buy 
some more.


> objects and recover the memory. Is this only a matter of memory? Is
> there any logging that reduces the memory? Or is this issue connected to
> the upper problem?
>
> I added the code on the bottom - maybe there is some kind of misuse I
> lost sight of. Any hints are appreciated.
>
> Kind regards,
> Maciej
>
>> version
>   _
> platform   i386-pc-mingw32
> arch   i386
> os mingw32

Re: [R] Import of Access data via RODBC changes column name ("NO" to "Expr1014") and the content of the column

2007-08-14 Thread Prof Brian Ripley
On Tue, 14 Aug 2007, Maciej Hoffman-Wecker wrote:

>
> Dear all,
>
> I have some problems with importing data from an Access data base via
> RODBC to R. The data base contains several tables, which all are
> imported consecutively. One table has a column with column name "NO". If
> I run the code attached on the bottom of the mail I get no complain, but
> the column name (name of the respective vector of the data.frame) is
> "Expr1014" instead of "NO". Additionally the original column (type
> "text") containes "0"s and missings, but the imported column contains
> "0"s only (type "int"). If I change the column name in the Access data
> base to "NOx", the import works fine with the right name and the same
> data.
>
> Previously I generated a tiny Access data base which reproduced the
> problem. To be on the safe site I installed the latest version (2.5.1)
> and now the example works fine, but within my production process the
> error still remaines. An import into excel via ODBC works fine.
>
> So there is no way to figure it out whether this is a bug or a
> feature.-)

It's most likely an ODBC issue, but you have not provided a reproducible 
example.

> The second problem I have is that when I rerun "rm(list = ls(all = T));
> gc()" and the import several times I get the following error:
>
> Error in odbcTables(channel) : Calloc could not allocate (263168 of 1)
> memory
> In addition: Warning messages:
> 1: Reached total allocation of 447Mb: see help(memory.size) in:
> odbcQuery(channel, query, rows_at_time)
> 2: Reached total allocation of 447Mb: see help(memory.size) in:
> odbcQuery(channel, query, rows_at_time)
> 3: Reached total allocation of 447Mb: see help(memory.size) in:
> odbcTables(channel)
> 4: Reached total allocation of 447Mb: see help(memory.size) in:
> odbcTables(channel)
>
> which is surprising to me, as the first two statements should delete all

How do you _know _what they 'should' do?  That only deletes all objects in 
the workspace, not all objects in R, and not all memory blocks used by R.

Please do read ?"Memory-limits" for the possible reasons.

Where did '447Mb' come from?  If this machine has less than 2Gb of RAM, 
buy some more.


> objects and recover the memory. Is this only a matter of memory? Is
> there any logging that reduces the memory? Or is this issue connected to
> the upper problem?
>
> I added the code on the bottom - maybe there is some kind of misuse I
> lost sight of. Any hints are appreciated.
>
> Kind regards,
> Maciej
>
>> version
>   _
> platform   i386-pc-mingw32
> arch   i386
> os mingw32
> system i386, mingw32
> status
> major  2
> minor  5.1
> year   2007
> month  06
> day27
> svn rev42083
> language   R
> version.string R version 2.5.1 (2007-06-27)
>
>
> ## code
>
> get.table <- function(name, db, drop = NULL){
>  .con <- try(odbcConnectAccess(db), silent = T)
>  if(!inherits(.con, "RODBC")) return(.con)
>  ## exclude memo columns
>  .t <- try(sqlColumns(.con, name))
>  if(inherits(.t, "try-error")){close(.con); return(.t)}
>  .t <- .t[.t$"COLUMN_SIZE" < 255, "COLUMN_NAME"]
>  .t <- paste(.t, collapse = ",")
>  ## get table
>  .t <- paste("select", .t, "from", name)
>  .d <- try(sqlQuery(.con, .t), silent = T)
>  if(inherits(.d, "try-error")){close(.con); return(.d)}
>  .con <- try(close(.con), silent = T)
>  if(inherits(.con, "try-error")) return(.con)
>  .d <- .d[!names(.d) %in% drop]
>  return(.d)
> }
>
> get.alltables <- function(db){
>  .con <- try(odbcConnectAccess(db), silent = T)
>  if(!inherits(.con, "RODBC")) return(.con)
>  .tbls <- try(sqlTables(.con)[["TABLE_NAME"]])
>  if(inherits(.tbls, "try-error")){close(.con); return(.tbls)}
>  .con <- try(close(.con), silent = T)
>  if(inherits(.con, "try-error")) return(.con)
>  .tbls <- .tbls[-grep("^MSys", .tbls)]
>  .d <- lapply(seq(along = .tbls), function(.i){
>.d <-
>  try(get.table(.tbls[.i], db = db))
>return(invisible(.d))
>  })
>  names(.d) <- .tbls
>  .ok <- !sapply(.d, inherits, "try-error")
>  return(list(notdone = .d[!.ok], data = .d[.ok]))
> }
>
> library(RODBC)
>
> alldata <- get.alltables(db = "./myaccessdb.MDB")
>
> ## code end
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://

[R] Import of Access data via RODBC changes column name ("NO" to "Expr1014") and the content of the column

2007-08-14 Thread Maciej Hoffman-Wecker

Dear all,
 
I have some problems with importing data from an Access data base via
RODBC to R. The data base contains several tables, which all are
imported consecutively. One table has a column with column name "NO". If
I run the code attached on the bottom of the mail I get no complain, but
the column name (name of the respective vector of the data.frame) is
"Expr1014" instead of "NO". Additionally the original column (type
"text") containes "0"s and missings, but the imported column contains
"0"s only (type "int"). If I change the column name in the Access data
base to "NOx", the import works fine with the right name and the same
data. 
 
Previously I generated a tiny Access data base which reproduced the
problem. To be on the safe site I installed the latest version (2.5.1)
and now the example works fine, but within my production process the
error still remaines. An import into excel via ODBC works fine.

So there is no way to figure it out whether this is a bug or a
feature.-)
 
The second problem I have is that when I rerun "rm(list = ls(all = T));
gc()" and the import several times I get the following error:
 
Error in odbcTables(channel) : Calloc could not allocate (263168 of 1)
memory
In addition: Warning messages:
1: Reached total allocation of 447Mb: see help(memory.size) in:
odbcQuery(channel, query, rows_at_time) 
2: Reached total allocation of 447Mb: see help(memory.size) in:
odbcQuery(channel, query, rows_at_time) 
3: Reached total allocation of 447Mb: see help(memory.size) in:
odbcTables(channel) 
4: Reached total allocation of 447Mb: see help(memory.size) in:
odbcTables(channel) 

which is surprising to me, as the first two statements should delete all
objects and recover the memory. Is this only a matter of memory? Is
there any logging that reduces the memory? Or is this issue connected to
the upper problem?
 
I added the code on the bottom - maybe there is some kind of misuse I
lost sight of. Any hints are appreciated.
 
Kind regards,
Maciej
 
> version
   _   
platform   i386-pc-mingw32 
arch   i386
os mingw32 
system i386, mingw32   
status 
major  2   
minor  5.1 
year   2007
month  06  
day27  
svn rev42083   
language   R   
version.string R version 2.5.1 (2007-06-27)

 
## code
 
get.table <- function(name, db, drop = NULL){
  .con <- try(odbcConnectAccess(db), silent = T)
  if(!inherits(.con, "RODBC")) return(.con)
  ## exclude memo columns
  .t <- try(sqlColumns(.con, name))
  if(inherits(.t, "try-error")){close(.con); return(.t)}
  .t <- .t[.t$"COLUMN_SIZE" < 255, "COLUMN_NAME"]
  .t <- paste(.t, collapse = ",")
  ## get table
  .t <- paste("select", .t, "from", name)
  .d <- try(sqlQuery(.con, .t), silent = T)
  if(inherits(.d, "try-error")){close(.con); return(.d)}
  .con <- try(close(.con), silent = T)
  if(inherits(.con, "try-error")) return(.con)
  .d <- .d[!names(.d) %in% drop]
  return(.d)
}
 
get.alltables <- function(db){
  .con <- try(odbcConnectAccess(db), silent = T)
  if(!inherits(.con, "RODBC")) return(.con)
  .tbls <- try(sqlTables(.con)[["TABLE_NAME"]])
  if(inherits(.tbls, "try-error")){close(.con); return(.tbls)}
  .con <- try(close(.con), silent = T)
  if(inherits(.con, "try-error")) return(.con)
  .tbls <- .tbls[-grep("^MSys", .tbls)]
  .d <- lapply(seq(along = .tbls), function(.i){
.d <-
  try(get.table(.tbls[.i], db = db)) 
return(invisible(.d))
  })
  names(.d) <- .tbls
  .ok <- !sapply(.d, inherits, "try-error")
  return(list(notdone = .d[!.ok], data = .d[.ok]))
}
 
library(RODBC)
 
alldata <- get.alltables(db = "./myaccessdb.MDB")
 
## code end

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.