If you want to inspect data, especially when it may contain invisible
characters, it helps to look at the byte values with a. i.
For example, this is what happened when I copied text from the message:
a=. ('1,2,"embedded, comma",3.4',CR,LF,'5,6,"no comma",7.8')
|spelling error
| a=. ('1,2,"embedded, comma",3.4',CR,LF,'5,6,"no comma",7.8')
| ^
a=.' (''1,2,"embedded, comma",3.4'',CR,LF,''5,6,"no comma",7.8'') '
a
('1,2,"embedded, comma",3.4',CR,LF,'5,6,"no comma",7.8')
a. i. a
194 160 40 39 49 44 50 44 34 101 109 98 101 100 100 101 100 44 32 99 111
109 109 97 34 44 51 46 52 39 44 67 82 44 76 70 44 39 53 44 54 44 34 110 111
32 99 111 109 109 97 34 44 55 46 56 39 41 32
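
The 194 160 at the front is the giveaway: those two bytes are the UTF-8 encoding of a no-break space, which looks like a blank but is not one, and is presumably what triggered the spelling error. Here is a small sketch of how to find and drop such bytes. The string s is made up for the example, and the names s and clean are mine, not from any library:

```j
NB. Hypothetical sample: a UTF-8 no-break space (bytes 194 160)
NB. followed by ordinary ASCII text.
s =: (194 160 { a.) , '(1,2)'

a. i. s                        NB. 194 160 40 49 44 50 41
I. 127 < a. i. s               NB. 0 1  (positions of the non-ASCII bytes)
clean =: s #~ 128 > a. i. s    NB. keep only the ASCII bytes
clean                          NB. (1,2)
```

This throws away every non-ASCII byte, so it is only suitable when the data is expected to be plain ASCII.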
On Dec 11, 2013 4:33 PM, "Jon Hough" <[email protected]> wrote:
> Thanks for the replies. It's going to take a while to take all this in.
>
> Regards,
> Jon
>
> > Date: Tue, 10 Dec 2013 11:38:30 -0500
> > From: [email protected]
> > To: [email protected]
> > Subject: Re: [Jprogramming] Beginner Understanding CSV file reading/writing
> >
> > You may also want to look at this:
> >
> > http://www.jsoftware.com/jwiki/NYCJUG/2012-12-11#Example_of_Free-Form_Text_Wrangling
> >
> >
> > On Tue, Dec 10, 2013 at 11:34 AM, Devon McCormick <[email protected]> wrote:
> >
> > > Just to gild the lily, one of our NYCJUG members implemented CSV parsing
> > > using J's finite-state machine primitives:
> > >
> > > http://www.jsoftware.com/jwiki/NYCJUG/2013-06-11?action=AttachFile&do=view&target=Parsing+CSV+Files+with+a+Finite+State+Machine.pdf
> > >
> > >
> > > On Tue, Dec 10, 2013 at 9:35 AM, Joe Bogner <[email protected]> wrote:
> > >
> > >> Just to expand on Devon's post, I often use a combination of cut and each
> > >> to split up a string.
> > >>
> > >> This will do the same (with a few more steps behind the scenes)
> > >>
> > >> > ',' cut each LF cut ('1,2,"embedded comma",3.4',CR,LF,'5,6,"no comma",7.8',CR,LF) -. CR
> > >>
> > >> as
> > >>
> > >> <;._1&>',',&.><;._2 CR-.~('1,2,"embedded comma",3.4',CR,LF,'5,6,"no comma",7.8',CR,LF)
> > >>
> > >> Jon, in case it helps to break it down:
> > >>
> > >> [Split on comma] [each] [Split on LF] [Remove CR] ('1,2,"embedded comma",3.4',CR,LF,'5,6,"no comma",7.8',CR,LF)
> > >>
> > >>
> > >> Step 1 - Remove the extra CR
> > >>
> > >> CR-.~ removes the carriage returns from the string. They are unnecessary
> > >> since we are splitting on LF.
> > >>
> > >> You can accomplish the same by doing this:
> > >>
> > >> ('1,2,"embedded comma",3.4',CR,LF,'5,6,"no comma",7.8',CR,LF) -. CR
> > >>
> > >> As Brian mentioned, the tilde just reverses the arguments.
> > >>
> > >> CR -.~ ('1,2,"embedded comma",3.4',CR,LF,'5,6,"no comma",7.8',CR,LF)
> > >>
> > >> Step 2 - Split on the last character, which is now LF
> > >>
> > >> http://www.jsoftware.com/jwiki/Vocabulary/semidot
> > >>
> > >> <;._2 will split on the last character of the string and drop it
> > >>
> > >> <;._2 ('A',LF,'B',LF,'C',LF)
> > >> ┌─┬─┬─┐
> > >> │A│B│C│
> > >> └─┴─┴─┘
> > >>
> > >> If you check out the definition of 'cut' you will see it has this same
> > >> operation
> > >>
> > >> Step 3 - Split on comma for each item
> > >>
> > >> In Step 2 we created a boxed array of strings, one for each LF. We now need
> > >> to operate on each box and split it on commas.
> > >>
> > >> The 'each' adverb will do this, which is what Devon has as "&.>"
> > >>
> > >> [Split on comma] is <;._1&>',' ,
> > >>
> > >> You can see it in action here:
> > >>
> > >> <;._1&>',' , each ('a,b';'c,d')
> > >> ┌─┬─┐
> > >> │a│b│
> > >> ├─┼─┤
> > >> │c│d│
> > >> └─┴─┘
> > >>
> > >> The trick here is to use the cut conjunction to split on commas. The cut
> > >> conjunction uses either the first or the last item in the array as the
> > >> delimiter. A CSV line won't have a comma at the beginning or the end, so we
> > >> first need to add a comma at the beginning of each boxed string so we can
> > >> tell cut to split on it.
> > >>
> > >> That is what ',',&.> is doing. It's adding a comma at the beginning of each
> > >> item.
> > >>
> > >> ',' ,&.> ('a,b';'c,d')
> > >> ┌────┬────┐
> > >> │,a,b│,c,d│
> > >> └────┴────┘
> > >>
> > >> ',' , each ('a,b';'c,d')
> > >>
> > >> ┌────┬────┐
> > >> │,a,b│,c,d│
> > >> └────┴────┘
> > >>
> > >>
> > >> Now that each boxed string starts with a comma, we can cut on the first
> > >> character and drop it.
> > >>
> > >> <;._1 &> ',' , each ('a,b';'c,d')
> > >>
> > >>
> > >> Back to the beginning:
> > >>
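> > >> <;._1 &> ',' , each <;._2 ('1,2,"embedded comma",3.4',CR,LF,'5,6,"no comma",7.8',CR,LF)
> > >>
> > >> Split on comma - for each item - in a LF split string. (The CR removal from
> > >> Step 1 is left out of this line, so the last field of each row keeps its
> > >> trailing CR.)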
> > >>
> > >> ┌─┬─┬────────────────┬────┐
> > >> │1│2│"embedded comma"│3.4 │
> > >> ├─┼─┼────────────────┼────┤
> > >> │5│6│"no comma"      │7.8 │
> > >> └─┴─┴────────────────┴────┘
> > >>
> > >>
> > >> Hope that helps. I learned more by going through it and wanted to share.
> > >>
> > >> On Sat, Dec 7, 2013 at 5:44 PM, Devon McCormick <[email protected]>
> > >> wrote:
> > >>
> > >> > Yes - sorry for typing it in w/o testing it. Note that the point at which
> > >> > the error was picked up is indicated by extra spaces in the returned line:
> > >> > mat=.<.; _1&>',',&.><;._2 CR-.~freads jpath'~temp/test.csv'
> > >> > |domain error
> > >> > | mat=.<.; _1&>',',&.><;._2 CR-.~freads jpath'~temp/test.csv'
> > >> >
> > >> > A good way to debug a line like this is to look at successively longer
> > >> > pieces, starting w/the rightmost one, e.g. (on my system):
> > >> > jpath '~temp/test.csv'
> > >> > c:/users/devonmcc/j64-701-user/temp/test.csv
> > >> >
> > >> > Do I have this file?
> > >> > fexist jpath '~temp/test.csv'
> > >> > 0
> > >> >
> > >> > So, I don't have this file - I only used it to mimic the example you sent.
> > >> > I create the file locally so I can continue looking at longer pieces:
> > >> > ('1,2,"embedded, comma",3.4',CR,LF,'5,6,"no comma",7.8') fwrite 'test.csv'
> > >> > 45
> > >> > fexist 'test.csv'
> > >> > 1
> > >> >
> > >> > BTW - "fexist" is defined
> > >> > fexist=: 1:@(1!:4) ::0:@(([: < 8 u: >) ::]&>)@(<^:(L. = 0:))
> > >> > in case you don't have it.
> > >> >
> > >> > Continuing with longer fragments shows us what the data looks like at each
> > >> > step:
> > >> > NB. mat=. <.;_1&>',',&.><;._2 CR-.~freads 'test.csv'
> > >> > freads 'test.csv'
> > >> > 1,2,"embedded, comma",3.4
> > >> > 5,6,"no comma",7.8
> > >> >
> > >> > CR-.~freads 'test.csv'
> > >> > 1,2,"embedded, comma",3.4
> > >> > 5,6,"no comma",7.8
> > >> >
> > >> > <;._2 CR-.~freads 'test.csv'
> > >> > +-------------------------+------------------+
> > >> > |1,2,"embedded, comma",3.4|5,6,"no comma",7.8|
> > >> > +-------------------------+------------------+
> > >> > ',',&.><;._2 CR-.~freads 'test.csv'
> > >> > +--------------------------+-------------------+
> > >> > |,1,2,"embedded, comma",3.4|,5,6,"no comma",7.8|
> > >> > +--------------------------+-------------------+
> > >> > <.;_1&>',',&.><;._2 CR-.~freads 'test.csv'
> > >> > |domain error
> > >> > | <.; _1&>',',&.><;._2 CR-.~freads'test.csv'
> > >> >
> > >> > Fixing the error:
> > >> > <;._1&>',',&.><;._2 CR-.~freads 'test.csv'
> > >> > +-+-+----------+-------+---+
> > >> > |1|2|"embedded | comma"|3.4|
> > >> > +-+-+----------+-------+---+
> > >> > |5|6|"no comma"|7.8    |   |
> > >> > +-+-+----------+-------+---+
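
The split on the embedded comma shown above can be avoided by ignoring commas that sit between double quotes. This is a minimal sketch, assuming the quotes on each line are balanced. splitcsv is a name I made up for illustration, not a library verb:

```j
NB. splitcsv (hypothetical helper): split one CSV line on commas,
NB. ignoring commas that fall inside double quotes.
splitcsv =: 3 : 0
  line =. ',' , y              NB. prepend a delimiter, as with <;._1 above
  inq  =. ~:/\ '"' = line      NB. running parity: 1 while inside quotes
  ((',' = line) *. -. inq) <;._1 line
)

splitcsv '1,2,"embedded, comma",3.4'    NB. four boxes; the quoted field stays whole
```

The boolean left argument to <;._1 marks where each field starts, so only the commas outside quotes act as delimiters. The quotes themselves are kept in the field.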
> > >> >
> > >> >
> > >> >
> > >> >
> > >> >
> > >> >
> > >> > On Sat, Dec 7, 2013 at 10:27 AM, Brian Schott <[email protected]> wrote:
> > >> >
> > >> > > It looks like there is a typo in the command with `mat`: .; should be ;. .
> > >> > > `mat` is not a verb but a noun, btw.
> > >> > > I think the tilde here is used dyadically, not monadically, and swaps the
> > >> > > arguments of -. in this case.
> > >> > >
> > >> > > On Sat, Dec 7, 2013 at 9:08 AM, Jon Hough <[email protected]> wrote:
> > >> > >
> > >> > > > I'd like to thank everyone for replying.
> > >> > > > I suppose I should think about using J7.
> > >> > > >
> > >> > > > I did try Devon's example:
> > >> > > > "You can read CSV files in J pretty simply without using any predefined
> > >> > > > verbs like this:
> > >> > > >
> > >> > > > mat=. <.;_1&>',',&.><;._2 CR-.~freads jpath '~temp/test.csv'
> > >> > > >
> > >> > > > and I got the error:
> > >> > > > |domain error
> > >> > > > | mat=.<.; _1&>',',&.><;._2 CR-.~freads jpath'~temp/test.csv'
> > >> > > >
> > >> > > > As an aside, I don't really understand what the "mat" function is doing.
> > >> > > > I'm still reading "J for C Programmers" so my understanding is a little
> > >> > > > shaky, but mat seems to be monadic, with the argument as the file to read.
> > >> > > > I'm not sure if this is an example of a tacit verb, because the argument
> > >> > > > ('~temp/test.csv') seems to be hardcoded into the verb.
> > >> > > >
> > >> > > > I assume:
> > >> > > > freads jpath '~temp/test.csv'
> > >> > > > reads the file. (http://www.jsoftware.com/user/script_files.htm)
> > >> > > > I do not really understand this: ~freads (I do not understand this use of
> > >> > > > the monadic tilde)
> > >> > > > I am trying to read this verb from right to left, but am not getting very
> > >> > > > far, even using the J dictionary and reference card for support.
> > >> > > > I would really appreciate any help at all in deciphering this.
> > >> > > >
> > >> > > > Thanks and regards,
> > >> > > > Jon
> > >> > > >
> > >> > > >
> > >> > > --
> > >> > > (B=) <-----my sig
> > >> > > Brian Schott
> > >> > >
> ----------------------------------------------------------------------
> > >> > > For information about J forums see
> > >> http://www.jsoftware.com/forums.htm
> > >> > >
> > >> >
> > >> >
> > >> >
> > >> > --
> > >> > Devon McCormick, CFA
> > >> >
> > >> >
> > >>
> > >
> > >
> > >
> > > --
> > > Devon McCormick, CFA
> > >
> > >
> >
> >
> > --
> > Devon McCormick, CFA
>