Thanks for everything. The utility function readdsv does everything I need. However, I noticed that its generality comes at a cost in speed: reading more than 400 million lines, each with a hundred variables, takes a long time.
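For files that are known to contain no quoted fields, a plain cut on TAB avoids the quote handling that readdsv performs, and may be faster for that reason. The following is only a minimal sketch under that assumption; the names cuttab, fillna and readtab are hypothetical helpers, not part of any addon, and no timing has been measured. It reads the whole file at once, so a file of 400 million lines would still have to be processed in pieces.

```j
NB. Assumption: fields contain no quoted TABs and no embedded newlines.
load 'files'                          NB. defines freads

NB. cut one line on TAB, keeping empty fields (hypothetical helper)
cuttab =: <;._1@(TAB&,)

NB. replace every empty box with 'NA' (hypothetical helper)
fillna =: ('NA'"_)^:(0 = #)&.>

NB. read a file, cut it into lines on LF, cut each line on TAB.
NB. Short rows are padded with empty boxes, which fillna also turns into 'NA'.
readtab =: 3 : 'fillna cuttab;._2 freads y'
```

For example, `readtab 'mytestfile.txt'` should return a boxed table in which the empty cells of the TAB file appear as NA, similar to what read.table with na.strings="" gives in R.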
Thanks again.

Toshinari

On 2012/11/10, at 6:39, Devon McCormick <[email protected]> wrote:

> Note that "readdsv" takes an optional left argument to specify the
> delimiter and quotes character, with a default of (TAB;'"'), e.g.
>
>    readdsv 'testTAB.txt'
> +----------------+-+-------------------------+
> |This is entry   |1|... more stuff           |
> +----------------+-+-------------------------+
> |'nuther entry, #|2|... even more stuff      |
> +----------------+-+-------------------------+
> |This entry -    |3| - has embedded TABs     |
> +----------------+-+-------------------------+
> |This entry, #   |4| has embedded commas, yo?|
> +----------------+-+-------------------------+
>    readdsv 'testComma.txt'
> +------------------------------------------+
> |This is entry,0,... more stuff            |
> +------------------------------------------+
> |'nuther entry, #,1,... even more stuff    |
> +------------------------------------------+
> |This entry - ,2, - has embedded TABs      |
> +------------------------------------------+
> |This entry, # ,3, has embedded commas, yo?|
> +------------------------------------------+
>    (',';'"') readdsv 'testComma.txt'
> +----------------+-+-------------------------+
> |This is entry   |0|... more stuff           |
> +----------------+-+-------------------------+
> |'nuther entry, #|1|... even more stuff      |
> +----------------+-+-------------------------+
> |This entry -    |2| - has embedded TABs     |
> +----------------+-+-------------------------+
> |This entry, #   |3| has embedded commas, yo?|
> +----------------+-+-------------------------+
>    (',';'') readdsv 'testComma.txt'
> +----------------+----+----------------------+---------------------+-----+
> |"This is entry" |0   |"... more stuff"      |                     |     |
> +----------------+----+----------------------+---------------------+-----+
> |"'nuther entry  | #" |1                     |"... even more stuff"|     |
> +----------------+----+----------------------+---------------------+-----+
> |"This entry - " |2   |" - has embedded TABs"|                     |     |
> +----------------+----+----------------------+---------------------+-----+
> |"This entry     | # "|3                     |" has embedded commas| yo?"|
> +----------------+----+----------------------+---------------------+-----+
>
>
> On Fri, Nov 9, 2012 at 4:21 PM, Ric Sherlock <[email protected]> wrote:
>
>> You might want to look at the 'tables/dsv' and 'tables/csv' addons.
>> http://www.jsoftware.com/jwiki/Addons/tables/dsv
>> http://www.jsoftware.com/jwiki/Addons/tables/csv
>>
>> The 'tables/csv' addon is basically a special case of the more general
>> 'tables/dsv', where the delimiter is set to ',' and the quote
>> character is set to '"' .
>>
>> To read those files would involve:
>>    require 'tables/csv'
>>    readdsv 'mytestfile.txt'
>>    readcsv 'mytestfile.csv'
>>
>> On Thu, Nov 8, 2012 at 12:00 PM, kamakura <[email protected]> wrote:
>>> Hi
>>>
>>> I would like to know J's manipulation for reading text file.
>>>
>>> fd=:'mytestfile.txt' NB. TAB text file
>>> NB. a \t 1 \t 2 \t 3
>>> NB. b \t \t 5 \t 6
>>> NB. c \t 7 \t \t 9
>>> NB. d \t 10 \t 11 \t 12
>>> NB. \t 13 \t 14 \t 15
>>>
>>> fd2=:'mytestfile.csv' NB. CSV test file
>>>
>>> load'files misc'
>>> freads fd2
>>> a,1,2,3
>>> b,,5,6
>>> c,7,,9
>>> d,10,11,12
>>> ,13,14,15
>>>
>>>
>>> freadr fd ;0 1
>>> a 1 2 3
>>>
>>> freadr fd ;1 1
>>> b 5 6
>>>
>>> freadr fd ;2 1
>>> 7 9
>>> d
>>> freadr fd ;3 1
>>> 10 11 1
>>> freadr fd ;4 1
>>>
>>> 13 14
>>>
>>> The 3rd and 4th rows are read differently. I expect that the following line will come out.
>>>
>>> freadr fd ;2 1
>>> c 7 9
>>>
>>>
>>>
>>> u=:'m' fread fd
>>> u
>>> a 1 2 3
>>> b 5 6
>>> c 7 9
>>> d 10 11 12
>>> 13 14 15
>>>
>>> TAB chop "1 u
>>> +--+--+-----+----+
>>> |a |1 |2    |3   |
>>> +--+--+-----+----+
>>> |b |5 |6    |    |
>>> +--+--+-----+----+
>>> |c |7 |     |9   |
>>> +--+--+-----+----+
>>> |d |10|11   |12  |
>>> +--+--+-----+----+
>>> |13|14|15   |    |
>>> +--+--+-----+----+
>>>
>>>
>>> How can I get the following table?
>>>
>>> +--+--+-----+----+
>>> |a |1 |2    |3   |
>>> +--+--+-----+----+
>>> |b |NA|5    |6   |
>>> +--+--+-----+----+
>>> |c |7 |9    |NA  |
>>> +--+--+-----+----+
>>> |d |10|11   |12  |
>>> +--+--+-----+----+
>>> |NA|13|14   |15  |
>>> +--+--+-----+----+
>>>
>>> R reads this text file as follows:
>>>
>>>> u=read.table("mytestfile.txt",header=F,na.strings="",sep="\t")
>>>> u
>>>     V1 V2 V3 V4
>>> 1    a  1  2  3
>>> 2    b NA  5  6
>>> 3    c  7 NA  9
>>> 4    d 10 11 12
>>> 5 <NA> 13 14 15
>>>
>>> Do you have any convenient utility function for reading text files?
>>>
>>>
>>>
>>> +++++++++++++++++++++++++++++
>>> Toshinari Kamakura
>>>
>>> Chuo University
>>> 1-13-27 Kasuga
>>> Bunkyo-ku
>>> Tokyo 112-8551, Japan
>>> ++++++++++++++++++++++++++++++
>>>
>>>
>>>
>>> ----------------------------------------------------------------------
>>> For information about J forums see http://www.jsoftware.com/forums.htm
>>
>
>
>
> --
> Devon McCormick, CFA
> ^me^ at acm.
> org is my
> preferred e-mail
