RE: [R] Re: read dbf files into R
On 29-Sep-04 Vikas Rawal wrote:
> Is there a linux-based/free command line tool for converting
> dbf files into txt?

It might not be difficult to write one. You basically need to first
decipher the file header, after which reading in the database records
is straightforward.

DBF file structure (byte counts start at 0):

    0:      dBASE version number
    1- 3:   Date of last update (YY MM DD as 3 separate BCD bytes)
    4- 7:   (binary int) Number of records in file
    8- 9:   (binary int) Total number of bytes in header (incl final 0Dh)
    10-11:  (binary int) Number of bytes per record
    12-31:  Reserved (not needed for reading file)
    32-**:  Series of 32-byte descriptions, one for each field
    Last :  Carriage-return byte (hex 0D)

For each 32-byte field descriptor:

    0-10:   Name of field (padded with zero 00 bytes)
    11:     Field type (ASCII letter: C N L D or M)
    12-15:  Field RAM address (not relevant for reading file)
    16:     (binary int) Length of field in bytes
    17:     (binary int) Number of decimal places
    18-31:  Not usefully informative for reading file

So if there are N fields, the header will occupy 32*(N+1)+1 bytes.

Thereafter, you can work out the length of each record in bytes from
the info in the field descriptors; each record starts with an
additional byte which is "*" if the record is marked for deletion,
otherwise " " (space). There is no delimiter at the end of a field,
nor at the end of a record (so simply use byte counts).

Then read in that number of bytes, dissect it into fields (according
to field lengths), and output the contents (e.g. comma-delimited) into
one line of the destination. Repeat until no more records remain.

All info in fields is stored as ASCII-coded characters, so can be
written straight out once read in. Note that Logical data (always
1 byte) may be ?, Y, y, N, n, T, t, F, f and Date data are YYYYMMDD.

(I'm assuming that there are no Memo data, which are not stored in
the DBF file but in a separate DBT file.)

Hope this helps,
Ted.
E-Mail: (Ted Harding) [EMAIL PROTECTED] Fax-to-email: +44 (0)870 094 0861 [NB: New number!] Date: 29-Sep-04 Time: 11:43:44 -- XFMail -- __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
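The procedure described in the message above can be sketched in R roughly as follows. This is a minimal illustration, not a tested converter: it assumes the header integers are little-endian (the message does not state the byte order), that there are no Memo fields, and that all field data are plain ASCII. The function name dbf_to_csv and its arguments are hypothetical.

```r
# Hypothetical helper (not part of any package): DBF -> CSV via raw byte counts.
dbf_to_csv <- function(dbf_path, csv_path) {
  con <- file(dbf_path, open = "rb")
  on.exit(close(con))

  # Little-endian unsigned integer from raw bytes (byte order is an assumption).
  le_int <- function(bytes) sum(as.integer(bytes) * 256^(seq_along(bytes) - 1))

  # 32-byte file header; R indexes from 1, the layout above counts from 0.
  hdr <- readBin(con, what = "raw", n = 32)
  n_records  <- le_int(hdr[5:8])             # bytes 4-7
  header_len <- le_int(hdr[9:10])            # bytes 8-9
  n_fields   <- (header_len - 33) %/% 32     # header = 32 * (N + 1) + 1

  # One 32-byte descriptor per field.
  field_names <- character(n_fields)
  field_lens  <- integer(n_fields)
  for (i in seq_len(n_fields)) {
    d <- readBin(con, what = "raw", n = 32)
    name_bytes <- d[1:11]                    # bytes 0-10, zero-padded
    field_names[i] <- rawToChar(name_bytes[name_bytes != as.raw(0)])
    field_lens[i]  <- as.integer(d[17])      # byte 16
  }
  readBin(con, what = "raw", n = 1)          # skip the final 0Dh byte

  # Field positions within a record; position 1 is the deletion flag.
  starts <- cumsum(c(2, field_lens))[seq_len(n_fields)]
  ends   <- starts + field_lens - 1
  record_len <- 1 + sum(field_lens)

  rows <- list()
  for (r in seq_len(n_records)) {
    rec <- readBin(con, what = "raw", n = record_len)
    if (rec[1] == as.raw(0x2A)) next         # "*" marks a deleted record
    fields <- vapply(seq_len(n_fields),
                     function(i) rawToChar(rec[starts[i]:ends[i]]),
                     character(1))
    rows[[length(rows) + 1]] <- trimws(fields)
  }

  # All values are kept as character strings, exactly as stored in the file.
  m <- matrix(as.character(unlist(rows)), ncol = n_fields, byrow = TRUE,
              dimnames = list(NULL, field_names))
  out <- as.data.frame(m, stringsAsFactors = FALSE)
  write.csv(out, csv_path, row.names = FALSE)
  invisible(out)
}
```

A call such as dbf_to_csv("in.dbf", "out.csv") (file names made up) would write one CSV line per non-deleted record. The sketch works out the record length from the field descriptors, as the message suggests, rather than trusting bytes 10-11 of the header.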
RE: [R] Re: read dbf files into R
> Is there a linux-based/free command line tool for converting
> dbf files into txt?

How about DBF2CSV 1.0, at http://www.dirfile.com/dbf2csv.htm ?
This is a Perl script.

Best wishes.
RE: [R] Re: read dbf files into R
On 29-Sep-04 Hisaji ONO wrote:
> > Is there a linux-based/free command line tool for converting
> > dbf files into txt?
>
> How about DBF2CSV 1.0 in http://www.dirfile.com/dbf2csv.htm.
> This is a perl script.

This seems to work well in extracting the field-names and data from a
DBF file. However, it also seems to wrap every field inside quotation
marks, as in

    "...","...","...",...

which will not usually be wanted (strictly speaking, quotation marks
are only needed in CSV for fields where a comma is part of the field
data).

However, in unix/linux at least, the resulting CSV file can be mended
by piping it through 'tr', as in

    cat oldCSVfile.csv | tr -d '"' > newCSVfile.csv

(assuming, of course, that you don't have field data with commas
inside ... )

Best wishes,
Ted.

E-Mail: (Ted Harding) [EMAIL PROTECTED]
Fax-to-email: +44 (0)870 094 0861 [NB: New number!]
Date: 29-Sep-04 Time: 13:52:14
-- XFMail --
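If the CSV file is going to be read into R anyway, stripping the quotation marks may not be necessary: read.csv() parses quoted fields itself, so a comma inside field data survives. A small sketch with made-up data (the field names and values are illustrative only, not actual DBF2CSV output):

```r
# Made-up lines in the all-fields-quoted style described above.
csv_lines <- c('"NAME","CITY","POP"',
               '"Smith, J.","Portsmouth","20779"')

# read.csv() removes the surrounding quotes and keeps the embedded comma.
dat <- read.csv(text = csv_lines, stringsAsFactors = FALSE)
str(dat)
```

Running the same lines through tr -d '"' first would split "Smith, J." into two fields, which is the caveat noted in the message.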
Re: [R] Re: read dbf files into R
Vikas Rawal wrote:
> > Is there a linux-based/free command line tool for converting
> > dbf files into txt?
>
> Conceptually, it is not a great way of doing things. We have a dbf
> file with a well-defined structure. We convert it into a text file,
> which has a loose structure, undefined variable types etc. And then
> we read the text file. It should be much better to read a dbf file
> directly and use its database structure definition to ensure that
> the data come into R correctly.
>
> The RODBC route does not seem suitable for my needs. I need to read
> some 300 files, and combine all the data. Using ODBC would mean that
> I would have to set up 300 DSNs in the odbc.ini. Or is there a way
> to set it up from the command line as well? I suppose it must be
> possible to write a script that will suitably modify the odbc.ini
> file. But that sounds far too complicated.
>
> I have been a user of SAS for a long time. This exercise would be
> done in a flash there. I wish there was a simple way of doing it in
> R. Don't we have a simple command that will read a dbf file, or in
> fact, a set of commands that will read common file formats? I see
> that we can read SAS, STATA and SPSS files. Somebody would have
> thought of doing the same for dbf. Isn't it?
>
> Vikas
>
> Vito Ricci wrote:
> > Hi,
> > read the manual: R Data Import/Export
> > http://cran.r-project.org/doc/manuals/R-data.pdf
> > Another way is to convert the .dbf file to .txt and use
> > read.table(), scan() and similar.
> > Best
> > Vito
> >
> > You wrote:
> > > I run R on redhat linux. What would be the easiest way to read
> > > dbf files into R?
> > > Vikas

Hi Vikas,

I use dbf.read from the maptools package available on CRAN. The
package itself is intended to read Arcview shapefiles, but dbf.read
can read a general dbf and throw it into an R data frame. Because
dbf.read is not exported from maptools' namespace, however, you'll
have to access it directly; e.g.,

    ## Non-trivial example: Read ZIP-county mapping file from www.census.gov
    zipnov <- maptools:::dbf.read("/home/kevin/census/zipnov99.DBF")
    head(zipnov)
      ZIP_CODE   LATITUDE   LONGITUDE ZIP_CLASS     PONAME STATE COUNTY
    1    00210 +43.005895 -071.013202         U PORTSMOUTH    33    015
    2    00211 +43.005895 -071.013202         U PORTSMOUTH    33    015
    3    00212 +43.005895 -071.013202         U PORTSMOUTH    33    015
    4    00213 +43.005895 -071.013202         U PORTSMOUTH    33    015
    5    00214 +43.005895 -071.013202         U PORTSMOUTH    33    015
    6    00215 +43.005895 -071.013202         U PORTSMOUTH    33    015

From there, you could write it out into a text file:

    ## Write to csv with usual options set
    write.table(zipnov, row.names = FALSE, sep = ",", file = "zipnov.csv")

Let me know if you have any questions.

Kevin
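For the case of several hundred files raised in the quoted message, one route that avoids ODBC is read.dbf() from the foreign package, which returns a data frame directly. The sketch below assumes that foreign is installed and that every file has the same columns; read_dbf_dir and the source_file column are hypothetical names chosen for illustration.

```r
library(foreign)  # provides read.dbf() and write.dbf()

# Hypothetical helper: read every .dbf file in a directory into one data frame.
read_dbf_dir <- function(dir) {
  files <- list.files(dir, pattern = "\\.dbf$", ignore.case = TRUE,
                      full.names = TRUE)
  tables <- lapply(files, function(f) {
    d <- read.dbf(f, as.is = TRUE)   # as.is = TRUE keeps character columns as strings
    d$source_file <- basename(f)     # remember which file each row came from
    d
  })
  do.call(rbind, tables)             # assumes identical columns in every file
}
```

If the files do not share the same columns, rbind() will stop with an error, so the combining step would need to be adapted to the actual data.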