Re: [R] read.fwf and header
FrPi == François Pinard [EMAIL PROTECTED] on Wed, 1 Nov 2006 20:21:11 -0500 writes:

    FrPi> [Martin Maechler]
    FrPi>> In my (and probably R-core's) view, read.fwf() should only have to be used for ``legacy data files'' (those times when people used *no* separators in order to save disk space), since nowadays, such data files should automatically have correct separators.

    FrPi> In my day-to-day experience, the main virtue for fixed width format
    FrPi> files is basic, humble legibility, much more than disk space savings.

Good point. For this reason, I often prefer tab-delimited data files, which are human-readable too (and typically don't need quoting of strings). But the white space-separated files that read.table() reads by default are also very readable if the column starts are aligned. You do then need to quote ("..") strings with embedded white space, but that remains readable if you have a smart editor (such as Emacs ;-) which automatically colorizes strings differently from the rest of the file entries.

However, I think this (human readability) only applies to relatively small files.

    FrPi> The FWF files I see have delimiters between fields,
    FrPi> but also embedded space within fields, or at end of
    FrPi> fields, without extraneous quotes. XML markup, CSVs,
    FrPi> quoted fields, etc. are devices meant for helping
    FrPi> machines much more than for helping humans. They
    FrPi> significantly decrease legibility. Humans not only
    FrPi> know better, they decipher fixed width format easily
    FrPi> enough for not really needing hairier devices in
    FrPi> general.

    FrPi> FWF files may be archaic, they are not obsolescent.
    FrPi> They will resist the fashion of the day for
    FrPi> complexity, and survive in the long run.

I cannot really oppose this statement, but am not as sure as you seem ;-)

Thanks anyway for the thought-provoking reply.
With regards, Martin __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
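As a small illustration of the white space-separated alternative described in this message, the sketch below reads an aligned table in which one string contains an embedded space and is therefore quoted. The column names and values are invented for the example; only read.table() and its default separator come from the discussion.

```r
# Hypothetical aligned, white space-separated data (invented values).
txt <- c('name       score',
         '"Ann Lee"    1.5',
         'Bob          2.0')

# The default sep = "" splits on runs of white space; the quotes keep
# "Ann Lee" together as a single field.
dat <- read.table(text = txt, header = TRUE)
print(dat)
```

Note that a blank (missing) field cannot be expressed in this layout without an explicit marker such as NA, which is the situation the original question in this thread runs into.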
Re: [R] read.fwf and header
How about using a connection and reading the header separately from the data, like this:

    tmp1 <- file('c:/temp/tmp.dat')
    open(tmp1)
    my.names <- scan(tmp1, nlines=1, what='')
    new.data <- read.fwf(file=tmp1, widths=c(3, 4, 10, 3, 2, 2, 2, 2, 11, 19), header=FALSE)
    names(new.data) <- my.names
    close(tmp1)

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
[EMAIL PROTECTED]
(801) 408-8111

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Gregor Gorjanc
Sent: Monday, October 30, 2006 3:33 PM
To: Daniel Nordlund
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] read.fwf and header

Daniel Nordlund wrote:
> Gregor,
>
> According to the help for read.fwf, sep needs to be set to a value that occurs only in the header record. I changed the spaces to commas in the header record of your example and used the following syntax and was able to read the file just fine.
>
>     new.data <- read.fwf(file="test.txt", widths=c(3, 4, 10, 3, 2, 2, 2, 2, 11, 19), header=TRUE, sep=',')
>
> Hope this is helpful,
> Dan

Thanks Dan! But I have to modify the file first. Not that much work, but still.

Regards, Gregor
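The connection approach can be tried without the poster's file. The sketch below is self-contained under stated assumptions: the three-column layout (widths 3, 4 and 2), the column names and the values are all invented, and a temporary file stands in for 'c:/temp/tmp.dat'. Passing col.names straight to read.fwf() is a small variation on assigning names() afterwards.

```r
# Hypothetical fixed-width sample: widths 3, 4 and 2, with blank fields.
widths <- c(3, 4, 2)
rows <- sprintf("%3s%4s%2s",
                c("1", "2", "3"),       # id
                c("1.5", "", "2.5"),    # val (blank in record 2)
                c("a", "b", ""))        # grp (blank in record 3)
path <- tempfile(fileext = ".dat")
writeLines(c("id val grp", rows), path)

# Open the connection once: scan() consumes the header line, and
# read.fwf() then continues from the current position.
con <- file(path, open = "r")
col_names <- scan(con, what = "", nlines = 1, quiet = TRUE)
dat <- read.fwf(con, widths = widths, header = FALSE,
                col.names = col_names, strip.white = TRUE)
close(con)
print(dat)
```

Because the connection is already open when read.fwf() receives it, read.fwf() does not close it, so the explicit close() is still needed.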
Re: [R] read.fwf and header
Greg Snow wrote:
> How about using a connection and reading the header separately from the data, like this:
>
>     tmp1 <- file('c:/temp/tmp.dat')
>     open(tmp1)
>     my.names <- scan(tmp1, nlines=1, what='')
>     new.data <- read.fwf(file=tmp1, widths=c(3, 4, 10, 3, 2, 2, 2, 2, 11, 19), header=FALSE)
>     names(new.data) <- my.names
>     close(tmp1)

Yes, that is also possible, as has been shown in previous posts.

--
Lep pozdrav / With regards,
    Gregor Gorjanc
----------------------------------------------------------------------
University of Ljubljana     PhD student
Biotechnical Faculty
Zootechnical Department     URI: http://www.bfro.uni-lj.si/MR/ggorjan
Groblje 3                   mail: gregor.gorjanc at bfro.uni-lj.si
SI-1230 Domzale             tel: +386 (0)1 72 17 861
Slovenia, Europe            fax: +386 (0)1 72 17 888
----------------------------------------------------------------------
"One must learn by doing the thing; for though you think you know it,
 you have no certainty until you try." Sophocles ~ 450 B.C.
Re: [R] read.fwf and header
[Martin Maechler]
> In my (and probably R-core's) view, read.fwf() should only have to be used for ``legacy data files'' (those times when people used *no* separators in order to save disk space), since nowadays, such data files should automatically have correct separators.

In my day-to-day experience, the main virtue for fixed width format files is basic, humble legibility, much more than disk space savings.

The FWF files I see have delimiters between fields, but also embedded space within fields, or at end of fields, without extraneous quotes. XML markup, CSVs, quoted fields, etc. are devices meant for helping machines much more than for helping humans. They significantly decrease legibility. Humans not only know better, they decipher fixed width format easily enough for not really needing hairier devices in general.

FWF files may be archaic, they are not obsolescent. They will resist the fashion of the day for complexity, and survive in the long run.

--
François Pinard http://pinard.progiciels-bpi.ca
Re: [R] read.fwf and header
Martin Maechler wrote:
> Gregor == Gregor Gorjanc [EMAIL PROTECTED] on Mon, 30 Oct 2006 23:33:21 +0100 writes:
>
>     Gregor> Daniel Nordlund wrote:
>     Gregor>> Gregor, According to the help for read.fwf, sep needs to be set to a value that occurs only in the header record. I changed the spaces to commas in the header record of your example and used the following syntax and was able to read the file just fine.
>     Gregor>>
>     Gregor>>     new.data <- read.fwf(file="test.txt", widths=c(3, 4, 10, 3, 2, 2, 2, 2, 11, 19), header=TRUE, sep=',')
>     Gregor>>
>     Gregor>> Hope this is helpful, Dan
>
>     Gregor> Thanks Dan! But I have to modify the file first. Not that
>     Gregor> much work, but still.
>
> Yes, but I think it shows read.fwf() should not be extended for even more special cases:
>
> In my (and probably R-core's) view, read.fwf() should only have to be used for ``legacy data files'' (those times when people used *no* separators in order to save disk space), since nowadays, such data files should automatically have correct separators.
>
> --> Fix the file producing process rather than make read.fwf() unnecessarily more complicated.

Thank you for this explanation of your (and probably R-core's) view! I really appreciate such feedback. I do agree that read.fwf is a somewhat archaic way to import data, but sometimes you cannot fix the file-producing process.

Perhaps the above explanation and the code examples from this thread could be added to the read.fwf help page. I can provide a patch if my proposal is sane.

Regards, Gregor
Re: [R] read.fwf and header
Archaic it may be, but I still have to deal with fixed format data files on a daily basis.

David L. Reiner
Rho Trading Securities, LLC
Chicago IL 60605
312-362-4963

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Martin Maechler
Sent: Tuesday, October 31, 2006 1:52 AM
To: [EMAIL PROTECTED]
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] read.fwf and header

<snip>

In my (and probably R-core's) view, read.fwf() should only have to be used for ``legacy data files'' (those times when people used *no* separators in order to save disk space), since nowadays, such data files should automatically have correct separators.

--> Fix the file producing process rather than make read.fwf() unnecessarily more complicated.

Martin Maechler, ETH Zurich
Re: [R] read.fwf and header
I also have to deal with fixed format files from time to time. Generally I have no control over the format in those cases.

On 10/31/06, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:
> Archaic it may be, but I still have to deal with fixed format data files on a daily basis.
>
> David L. Reiner
> Rho Trading Securities, LLC
> Chicago IL 60605
> 312-362-4963
>
> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Martin Maechler
> Sent: Tuesday, October 31, 2006 1:52 AM
> To: [EMAIL PROTECTED]
> Cc: r-help@stat.math.ethz.ch
> Subject: Re: [R] read.fwf and header
>
> <snip>
>
> In my (and probably R-core's) view, read.fwf() should only have to be used for ``legacy data files'' (those times when people used *no* separators in order to save disk space), since nowadays, such data files should automatically have correct separators.
>
> --> Fix the file producing process rather than make read.fwf() unnecessarily more complicated.
>
> Martin Maechler, ETH Zurich
[R] read.fwf and header
Hi!

I have data (also in attached file) in the following form:

    num1 num2 num3 int1 fac1 fac2 cha1 cha2 Date POSIXt
    1 1 f q 1900-01-01 1900-01-01 01:01:01
    2 1.0 131.5 2 a g r z 1900-01-01 01:01:01
    3 1.5 1188830.5 3 b h s y 1900-01-01 1900-01-01 01:01:01
    4 2.0 1271846.3 4 c i t x 1900-01-01 1900-01-01 01:01:01
    5 2.5 829737.4 d j u w 1900-01-01
    6 3.0 1240967.3 5 e k v v 1900-01-01 1900-01-01 01:01:01
    7 3.5 919684.4 6 f l w u 1900-01-01 1900-01-01 01:01:01
    8 4.0 968214.6 7 g m x t 1900-01-01 1900-01-01 01:01:01
    9 4.5 1232076.4 8 h n y s 1900-01-01 1900-01-01 01:01:01
    10 5.0 1141273.4 9 i o z r 1900-01-01 1900-01-01 01:01:01
    5.5 988481.4 10 j q 1900-01-01 1900-01-01 01:01:01

This is a FWF (fixed width format) file. I cannot use read.table here, because of the missing values. I have tried the following:

    read.fwf(file="test.txt", widths=c(3, 4, 10, 3, 2, 2, 2, 2, 11, 20), header=TRUE)
    Error in read.table(file = FILE, header = header, sep = sep, as.is = as.is, :
      more columns than column names

I could use:

    read.fwf(file="test.txt", widths=c(3, 4, 10, 3, 2, 2, 2, 2, 11, 20), header=FALSE, skip=1)
       V1 V2 V3 V4 V5 V6 V7 V8 V9 V10
    1 1 NA NA 1 f q 1900-01-01 1900-01-01 01:01:01
    2 2 1.0 131.5 2 a g r z 1900-01-01 01:01:01
    3 3 1.5 1188830.5 3 b h s y 1900-01-01 1900-01-01 01:01:01
    4 4 2.0 1271846.3 4 c i t x 1900-01-01 1900-01-01 01:01:01
    5 5 2.5 829737.4 NA d j u w 1900-01-01
    6 6 3.0 1240967.3 5 e k v v 1900-01-01 1900-01-01 01:01:01
    7 7 3.5 919684.4 6 f l w u 1900-01-01 1900-01-01 01:01:01
    8 8 4.0 968214.6 7 g m x t 1900-01-01 1900-01-01 01:01:01
    9 9 4.5 1232076.4 8 h n y s 1900-01-01 1900-01-01 01:01:01
    10 10 5.0 1141273.4 9 i o z r 1900-01-01 1900-01-01 01:01:01
    11 NA 5.5 988481.4 10 j q 1900-01-01 1900-01-01 01:01:01

Does anyone have a clue how to get the above result with the header?

Thanks!

--
Lep pozdrav / With regards,
    Gregor Gorjanc
----------------------------------------------------------------------
University of Ljubljana     PhD student
Biotechnical Faculty
Zootechnical Department     URI: http://www.bfro.uni-lj.si/MR/ggorjan
Groblje 3                   mail: gregor.gorjanc at bfro.uni-lj.si
SI-1230 Domzale             tel: +386 (0)1 72 17 861
Slovenia, Europe            fax: +386 (0)1 72 17 888
----------------------------------------------------------------------
"One must learn by doing the thing; for though you think you know it,
 you have no certainty until you try." Sophocles ~ 450 B.C.
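To make the problem concrete without the missing attachment, here is a minimal sketch on an invented three-column file (widths 3, 4 and 2; names and values are hypothetical). It shows why white space splitting is unreliable when fields are blank, and reproduces the header=FALSE, skip=1 workaround from this post.

```r
# Hypothetical fixed-width sample with blank (missing) fields.
widths <- c(3, 4, 2)
rows <- sprintf("%3s%4s%2s",
                c("1", "2", "3"),
                c("1.5", "", "2.5"),
                c("a", "b", ""))
path <- tempfile(fileext = ".txt")
writeLines(c("id val grp", rows), path)

# Splitting on white space yields a different number of fields per line,
# so a blank field cannot be told apart from a shifted one.
n_fields <- count.fields(path)
print(n_fields)

# read.fwf() cuts by position instead, so blank fields stay in place;
# skipping the header line leaves the default names V1, V2, V3.
dat <- read.fwf(path, widths = widths, header = FALSE, skip = 1)
print(dat)
```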
Re: [R] read.fwf and header
On Mon, 2006-10-30 at 19:51 +0100, Gregor Gorjanc wrote:
> Hi!
>
> I have data (also in attached file) in the following form:
>
> <snip>
>
> This is a FWF (fixed width format) file. I can not use read.table here, because of missing values. I have tried with the following
>
>     read.fwf(file="test.txt", widths=c(3, 4, 10, 3, 2, 2, 2, 2, 11, 20), header=TRUE)
>     Error in read.table(file = FILE, header = header, sep = sep, as.is = as.is, :
>       more columns than column names
>
> I could use:
>
>     read.fwf(file="test.txt", widths=c(3, 4, 10, 3, 2, 2, 2, 2, 11, 20), header=FALSE, skip=1)
>
> <snip>
>
> Does anyone have a clue, how to get above result with header?
>
> Thanks!

The attachment did not come through. Perhaps it was too large?

Not sure if this is the most efficient way, but how about this:

    DF <- read.fwf("test.txt",
                   widths=c(3, 4, 10, 3, 2, 2, 2, 2, 11, 20),
                   skip = 1, strip.white = TRUE,
                   col.names = read.table("test.txt", nrow = 1, as.is = TRUE)[1, ])

    DF
       num1 num2 num3 int1 fac1 fac2 cha1 cha2 Date
    1 1 NA NA 1 f q 1900-01-01
    2 2 1.0 131.5 2 a g r z
    3 3 1.5 1188830.5 3 b h s y 1900-01-01
    4 4 2.0 1271846.3 4 c i t x 1900-01-01
    5 5 2.5 829737.4 NA d j u w 1900-01-01
    6 6 3.0 1240967.3 5 e k v v 1900-01-01
    7 7 3.5 919684.4 6 f l w u 1900-01-01
    8 8 4.0 968214.6 7 g m x t 1900-01-01
    9 9 4.5 1232076.4 8 h n y s 1900-01-01
    10 10 5.0 1141273.4 9 i o z r 1900-01-01
    11 NA 5.5 988481.4 10 j q 1900-01-01
       POSIXt
    1 1900-01-01 01:01:01
    2 1900-01-01 01:01:01
    3 1900-01-01 01:01:01
    4 1900-01-01 01:01:01
    5 NA
    6 1900-01-01 01:01:01
    7 1900-01-01 01:01:01
    8 1900-01-01 01:01:01
    9 1900-01-01 01:01:01
    10 1900-01-01 01:01:01
    11 1900-01-01 01:01:01

Of course, with the limited number of columns, you can always just set

    colnames(DF) <- c("num1", "num2", "num3", "int1", "fac1", "fac2", "cha1", "cha2", "Date", "POSIXt")

as a post-import step.

HTH,

Marc Schwartz
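The same idea can be sketched on a self-contained file. Everything about the sample (widths 3, 4 and 2, the names id/val/grp, the values) is invented for illustration; the only deliberate change from the code above is that the one-row data frame is flattened with unlist() before being passed as col.names, which is an assumption about tidiness rather than a requirement stated in the thread.

```r
# Hypothetical fixed-width sample with a space-separated header line.
widths <- c(3, 4, 2)
rows <- sprintf("%3s%4s%2s",
                c("1", "2", "3"),
                c("1.5", "", "2.5"),
                c("a", "b", ""))
path <- tempfile(fileext = ".txt")
writeLines(c("id val grp", rows), path)

# Read only the first line as data to recover the column names ...
hdr <- read.table(path, header = FALSE, nrows = 1, as.is = TRUE)
col_names <- unlist(hdr, use.names = FALSE)

# ... then read the records by position, skipping the header line.
dat <- read.fwf(path, widths = widths, skip = 1, strip.white = TRUE,
                col.names = col_names)
print(dat)
```

This reads the file twice, which is harmless for small files; the header is parsed by white space, so it only works when the column names themselves contain no spaces.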
Re: [R] read.fwf and header
Gregor,

According to the help for read.fwf, sep needs to be set to a value that occurs only in the header record. I changed the spaces to commas in the header record of your example and used the following syntax and was able to read the file just fine.

    new.data <- read.fwf(file="test.txt", widths=c(3, 4, 10, 3, 2, 2, 2, 2, 11, 19), header=TRUE, sep=',')

Hope this is helpful,

Dan

Daniel Nordlund
Bothell, WA USA

> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Gregor Gorjanc
> Sent: Monday, October 30, 2006 10:52 AM
> To: r-help@stat.math.ethz.ch
> Subject: [R] read.fwf and header
>
> Hi!
>
> I have data (also in attached file) in the following form:
>
> <snip>
>
> This is a FWF (fixed width format) file. I can not use read.table here, because of missing values. I have tried with the following
>
>     read.fwf(file="test.txt", widths=c(3, 4, 10, 3, 2, 2, 2, 2, 11, 20), header=TRUE)
>     Error in read.table(file = FILE, header = header, sep = sep, as.is = as.is, :
>       more columns than column names
>
> I could use:
>
>     read.fwf(file="test.txt", widths=c(3, 4, 10, 3, 2, 2, 2, 2, 11, 20), header=FALSE, skip=1)
>
> <snip>
>
> Does anyone have a clue, how to get above result with header?
>
> Thanks!
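The behaviour described in this reply can be reproduced on a small invented file (widths 3, 4 and 2; names and values are hypothetical). With header=TRUE, read.fwf() expects the names in the header line to be delimited by sep, so a space-separated header is seen as a single name and the read fails, while a comma-separated header read with sep=',' works.

```r
# Hypothetical fixed-width records shared by both files.
widths <- c(3, 4, 2)
rows <- sprintf("%3s%4s%2s",
                c("1", "2", "3"),
                c("1.5", "", "2.5"),
                c("a", "b", ""))

# Space-separated header: the default sep is a tab, so the whole header
# line counts as one name and the read stops with an error.
spaced <- tempfile(fileext = ".txt")
writeLines(c("id val grp", rows), spaced)
res <- try(read.fwf(spaced, widths = widths, header = TRUE), silent = TRUE)
failed <- inherits(res, "try-error")
print(failed)

# Comma-separated header, read with sep = ",": the names are split
# correctly. The separator must not occur anywhere in the data records.
comma <- tempfile(fileext = ".txt")
writeLines(c("id,val,grp", rows), comma)
dat <- read.fwf(comma, widths = widths, header = TRUE, sep = ",",
                strip.white = TRUE)
print(dat)
```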
Re: [R] read.fwf and header
Marc Schwartz wrote:
> On Mon, 2006-10-30 at 19:51 +0100, Gregor Gorjanc wrote:
>> Hi!
>>
>> I have data (also in attached file) in the following form:
>>
>> <snip>
>>
>> Does anyone have a clue, how to get above result with header?
>>
>> Thanks!
>
> The attachment did not come through. Perhaps it was too large?

Argh, my fault, as I forgot to attach it :(

> Not sure if this is the most efficient way, but how about this:
>
>     DF <- read.fwf("test.txt",
>                    widths=c(3, 4, 10, 3, 2, 2, 2, 2, 11, 20),
>                    skip = 1, strip.white = TRUE,
>                    col.names = read.table("test.txt", nrow = 1, as.is = TRUE)[1, ])

That is a very nice compromise! There is no need for [1, ], because of nrow=1.

> Of course, with the limited number of columns, you can always just set
>
>     colnames(DF) <- c("num1", "num2", "num3", "int1", "fac1", "fac2", "cha1", "cha2", "Date", "POSIXt")

I fully agree here, but I kind of miss having this directly in read.fwf. I hope that someone from R-core is also listening to this ;)

Thank you!

Gregor

    num1 num2 num3 int1 fac1 fac2 cha1 cha2 Date POSIXt
    1 1 f q 1900-01-01 1900-01-01 01:01:01
    2 1.0 131.5 2 a g r z 1900-01-01 01:01:01
    3 1.5 1188830.5 3 b h s y 1900-01-01 1900-01-01 01:01:01
    4 2.0 1271846.3 4 c i t x 1900-01-01 1900-01-01 01:01:01
    5 2.5 829737.4 d j u w 1900-01-01
    6 3.0 1240967.3 5 e k v v 1900-01-01 1900-01-01 01:01:01
    7 3.5 919684.4 6 f l w u 1900-01-01 1900-01-01 01:01:01
    8 4.0 968214.6 7 g m x t 1900-01-01 1900-01-01 01:01:01
    9 4.5 1232076.4 8 h n y s 1900-01-01 1900-01-01 01:01:01
    10 5.0 1141273.4 9 i o z r 1900-01-01 1900-01-01 01:01:01
    5.5 988481.4 10 j q 1900-01-01 1900-01-01 01:01:01
Re: [R] read.fwf and header
Daniel Nordlund wrote:
> Gregor,
>
> According to the help for read.fwf, sep needs to be set to a value that occurs only in the header record. I changed the spaces to commas in the header record of your example and used the following syntax and was able to read the file just fine.
>
>     new.data <- read.fwf(file="test.txt", widths=c(3, 4, 10, 3, 2, 2, 2, 2, 11, 19), header=TRUE, sep=',')
>
> Hope this is helpful,
> Dan

Thanks Dan! But I have to modify the file first. Not that much work, but still.

Regards, Gregor
Re: [R] read.fwf and header
Gregor == Gregor Gorjanc [EMAIL PROTECTED] on Mon, 30 Oct 2006 23:33:21 +0100 writes:

    Gregor> Daniel Nordlund wrote:
    Gregor>> Gregor, According to the help for read.fwf, sep needs to be set to a value that occurs only in the header record. I changed the spaces to commas in the header record of your example and used the following syntax and was able to read the file just fine.
    Gregor>>
    Gregor>>     new.data <- read.fwf(file="test.txt", widths=c(3, 4, 10, 3, 2, 2, 2, 2, 11, 19), header=TRUE, sep=',')
    Gregor>>
    Gregor>> Hope this is helpful, Dan

    Gregor> Thanks Dan! But I have to modify the file first. Not that
    Gregor> much work, but still.

Yes, but I think it shows read.fwf() should not be extended for even more special cases:

In my (and probably R-core's) view, read.fwf() should only have to be used for ``legacy data files'' (those times when people used *no* separators in order to save disk space), since nowadays, such data files should automatically have correct separators.

--> Fix the file producing process rather than make read.fwf() unnecessarily more complicated.

Martin Maechler, ETH Zurich