---PackRat wrote:
> Ah, if life were only so simple! In subsequently working with the
> actual data I need to import, I discovered there were several
> variations in the format of the data, which creates some
> complications in creating a more general solution. My general
> question is whether I need to have multiple scripts to handle each
> variation, or is there a J way to accommodate the variations so that
> the J vector/array will end up being the same, regardless of the
> input variation?

The short answer is "Yes", but how to do it will depend on the formats of
the different variations. Ideally you will be able to come up with a
method that works for all your variations, but in the worst case you
should be able to recognise the type of file and use a select. case.
construct to handle each type separately.

> Frankly, I don't remember why I added ". to that statement:
>    list1 =: ". 'm' fread < 'C:\rfile1.txt'
> Maybe seeing an example or something?? As I recall, it had to do
> with converting characters to numeric values

Yes, ". will convert a literal number to a numeric one, but the dyadic
version is faster and more specific. See
http://www.jsoftware.com/jwiki/Guides/General_FAQ/Numbers_and_Character_Representations

> I *do* know that I had to add x: because the numbers being read in
> were long: they were 14-digit library barcodes. What's interesting
> is that *ALL* the documentation says J will handle up to about 16
> digits without flipping over to exponential notation, yet it failed
> already with 14 digits. As I said, interesting.

I think you can get around this by increasing the print precision
(Edit|Configure|Parameters), but for your situation you may be better
off working with the numbers as text.

> Well, *this* presented several challenges! Since the earlier file
> also contained characters, I thought J would handle this data if I
> went back to the "non-numeric" reading of data:
>    list1 =: 'm' fread < 'C:\rfile1.txt'
>
> However, apparently 'm' requires *numeric* data??

No, 'm' fread 'c:\rfile1.txt' will work fine with literal data.

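On the conversion and barcode points above, here is a minimal sketch of
dyadic ". and of keeping a barcode as text. The sample strings and the
name bc are made up for illustration, and the result lines shown are what
I would expect from a standard J session rather than output copied from
your data.

```j
   NB. dyadic ". converts text to numbers; the left argument (_1 here)
   NB. is substituted for any field that is not a valid number
   _1 ". '42 abc 3.5'
42 _1 3.5

   NB. a 14-digit barcode kept as text: no precision or display issues
   bc =. '12345678901234'     NB. hypothetical barcode
   # bc
14

   NB. if the numeric form is needed, raise the print precision first
   NB. (9!:11 sets it from code; the default is 6 significant digits)
   9!:11 ] 16
   0 ". bc
```

Keeping the barcodes as text also sidesteps the trailing "x" check-digit
question, since nothing has to be parsed as a number at all.
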
> I can't seem to find verbs in J that are equivalent to the following
> Visual Basic-like commands: StringLeft StringRight StringMid

See Bill Lam's reply.

> Another question I see coming up shortly is how do I get J to accept
> the fact that a terminating "x" or "X" (in the above numbers, for
> example, or in book ISBNs) is a valid "numeric" character, being the
> result of a base-11 check-digit algorithmic calculation? Or do I
> have to consider these "numbers" as *strings* (of characters) instead?

It is probably possible to convert this to a base 10 number if required,
but depending on your needs it could just be left as text.

> And, if I need to think/program in terms of strings (I presume this
> means boxed data?), will the set operations above work on boxed data,
> too, or are other definitions needed for boxed textual data?

Boxing strings is most useful for strings of unequal lengths.

> (My earlier experiments with this script seemed to indicate that you
> couldn't sort boxed data or perform set operations on the boxed data.

You can sort boxed data.

   /:~ '"b39928282"';'"b29392209"';'"b52343345"'
+-----------+-----------+-----------+
|"b29392209"|"b39928282"|"b52343345"|
+-----------+-----------+-----------+

or the rows of a text array

   ]tmp=. >'"b3992828x"';'"b29392203"';'"b52343343"'
"b3992828x"
"b29392203"
"b52343343"
   /:~tmp
"b29392203"
"b3992828x"
"b52343343"

If you want to drop the double quotes in the first and last columns you
could do

   }.@:}:"1 tmp
b3992828x
b29392203
b52343343

If your values are equal length I'd read the file into a text array
(matrix) using

   tmp=. 'm' fread <filename>

If they are unequal length then a better option would be to read the
file into a boxed list using

   tmp=. 'b' fread <filename>

> And one more question (for now!) about data massaging in J: it turns
> out that another variation in the data is that the first data item in
> many of the files I wish to read is a column header such as the
> following:
>    "RECORD #(BIBLIO)"
> How can I program J to read a file, *omitting* the first data value?

After reading into a noun using either of the above methods, you can
drop the first item using }. You could test whether the first record is
a column header and drop it only if it is, for example:

   tmp=. (+./'"RECORD' E. {.tmp)}.tmp      NB. use with array

or

   tmp=. (+./'"RECORD' E. 0{::tmp)}.tmp    NB. use with boxed list

One of the things that is nice about J is that many primitives will work
with arrays whether they are numeric or literal.
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
