Tim,

Coming to this with a bit of a delay, but one of the things that you may wish to try is to set 'rows_at_time = 1' in your calls to sqlQuery(). See the help file for the function.

Prior to RODBC version 1.3-0, this defaulted to 1. However, beginning with 1.3-0, the default changed to 100.

I had problems with the new default value using RODBC connecting to an Oracle server. These were typically queries returning 0 rows, where I knew that there should be more than that and typically in the hundreds. The problem manifested itself when the view names were > 11 or 12 characters. This had suggested the possibility of a pointer gone awry, stepping on and corrupting the query result. However, I could never confirm that and Prof. Ripley could not replicate the problem on his end.

I think that lacking further investigation, it may be representative of a bug in the ODBC driver itself.

Setting 'rows_at_time' to 1 has resolved the problem for me and you should see if that resolves anything in your situation.

BTW, the error message that you get below after loading RODBC on Linux suggests that you need to re-install it. It appears to have been built using a prior version of R, before dynamically generated help files were introduced in recent versions.

HTH,

Marc Schwartz

On Dec 15, 2009, at 2:36 PM, Tim Coote wrote:

Well my fix is clearly inadequate as I'm getting malloc errors when I re-read the information :-(
On 15 Dec 2009, at 17:49, Tim Coote wrote:

I now have a simplistic fix for this error. I have no idea how to create test or submit a patch. Indeed, i've got zero familiarity with the testing model.

Here's the skinny. The RODBC.c driver code does not handle SQL_FLOATs (ie DATA_TYPE 6 in my original posting) returned by the odbc driver, so they fall through to the default encoding mechanism, which is character based (I think). As far as I can see, all that's needed is to include SQL_FLOAT to all of the SQL_DOUBLE cases in three switch statements. (one in cachenbind, two in RODBCFetchRows.)

I'd guess that the reason that I've found this bug on the mac is that my linux box (Fedora Core 10) uses an older version of RODBC (rpm: R-RODBC-1.2-2.fc10.i386) and the bug's crept in between the two versions. However, I've not diff'd the sources. It could just be that no one else uses RODBC + PostgreSQL.

What's the process for submitting a suitable patch? or is the above description enough to be picked up by the package maintainers?

Tim
On 2 Dec 2009, at 22:49, Tim Coote wrote:

For some reason I've got a problem with double precision numbers in postgres coming across into R as integers (they're just truncated).

I'm using:
R 2.10.0 GUI 1.30 Leopard build 32-bit (5511)
R-RODBC version 1.3
I've tried both the packaged postgresql odbc driver for the mac and rebuilt from source with the same results (psqlodbc-08.04.0100), against postgres 8.4 on my mac and against 8.3 on Fedora Linux.

the odbc driver works as expected with python + pyodbc, and with R 2.10, RODBC (R-RODBC-1.2-2.fc10.i386, R-2.10.0-2.fc10.i386).

Here's some noddy results (I create a table called x with with one double precision column) - I'm ignoring the warning, which seems to be a distro error for fc10:

From R on Linux:
> library (RODBC)
Warning message:
package 'RODBC' was built under R version 2.8.1 and help may not work correctly
> c=odbcConnect ("PostgreSQL",  uid="yy", pwd="xxx")
> sqlQuery (c, "select * from x")
    x
1  1.234
2 43.989
> sqlColumns(c, "x")
TABLE_QUALIFIER TABLE_OWNER TABLE_NAME COLUMN_NAME DATA_TYPE TYPE_NAME 1 cases public x x 6 float8 PRECISION LENGTH SCALE RADIX NULLABLE REMARKS COLUMN_DEF SQL_DATA_TYPE 1 15 8 0 10 1 <NA> 6 SQL_DATETIME_SUB CHAR_OCTET_LENGTH ORDINAL_POSITION IS_NULLABLE DISPLAY_SIZE 1 NA NA 1 <NA> 22
FIELD_TYPE AUTO_INCREMENT PHYSICAL NUMBER TABLE OID
1        701              0               1     16517


However, on the mac:
> c=odbcConnect ("Hg", uid="yy", pwd="xxx")
> sqlQuery(c, "select * from x")
x
1  1
2 43
> sqlColumns (c, "x")
TABLE_QUALIFIER TABLE_OWNER TABLE_NAME COLUMN_NAME DATA_TYPE TYPE_NAME PRECISION LENGTH SCALE RADIX NULLABLE REMARKS 1 cases public x x 6 float8 15 8 0 10 1 COLUMN_DEF SQL_DATA_TYPE SQL_DATETIME_SUB CHAR_OCTET_LENGTH ORDINAL_POSITION IS_NULLABLE DISPLAY_SIZE FIELD_TYPE 1 <NA> 6 NA NA 1 <NA> 22 701
AUTO_INCREMENT PHYSICAL NUMBER TABLE OID BASE TYPEID
1              0               1     16517           0

Uid and pwd obfuscated. As far as I can see, the driver on the mac is correctly identifying the type of the column 'x' as a float8, but then truncating before the decimal point. I have seen somewhere that Oracle drivers used to suffer from this issue, when they truncated type 6 variables.

python looks ok on the mac:
>>> import pyodbc
>>> c=pyodbc.connect (dsn="Hg")
>>> cu=c.cursor().execute ("select * from x")
>>> print cu.fetchall()
[(1.234, ), (43.988999999999997, )]

I cannot find a pyodbc equivalent to sqlColumns. As far as I can see, R and python on my mac are using a common stack, and R seems to work ok with the fedora odbc drivers. Is it likely that I've still got bugs in my odbc set up or is there an issue with R + ODBC? I'm not sure where else to look for what's going wrong.

Tim

_______________________________________________
R-SIG-Mac mailing list
R-SIG-Mac@stat.math.ethz.ch
https://stat.ethz.ch/mailman/listinfo/r-sig-mac

Reply via email to