If you are doing many operations on the files, it may be beneficial to
convert the chars into numbers.
At least in my past experience, APL used to work a hell of a lot faster
on numbers than on chars.
If you only work on each file once, then the inversion process is going to be
too heavy to be worthwhile.
In ADI the inverting was done by a PL/I program when it involved huge files.
I have no idea whether J is still a lot faster on numbers than on chars, as APL was.
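
A minimal sketch of what such a conversion could look like in J, assuming N by 2 character columns like the ones described in the quoted message. The names col, codes, blank, mask and back are made up for illustration, and whether the integer form is actually faster would have to be timed:

```j
NB. Hypothetical 5 by 2 character column; blank rows mark gaps.
col   =: > 'AB' ; '' ; 'CD' ; '' ; ''

NB. Encode each 2-character row as one integer (base 256 of the character codes).
codes =: 256 #. a. i. col

NB. Two spaces encode to 8224 (32*256 + 32), the value of every blank row.
blank =: 256 #. a. i. '  '

NB. 1 where a row holds data; this compares integers instead of characters.
mask  =: codes ~: blank

NB. Decode back to characters when the text is needed again.
back  =: a. {~ (2 # 256) #: codes
```

The encoding is a one-time cost per file, which is why it only pays off if the numeric form is reused for many operations.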

2006/12/11, Dan Bron <[EMAIL PROTECTED]>:

Hello all,

My current project involves processing ~160MB text files (I have one
file per day).  With a little trickery and convincing, J performs
admirably fast on these (~12 seconds per file, so about 6 minutes per
month, or somewhat over an hour per year).

But I'm an impatient man.  Also, I'd like to practice a little J
evangelism here.  So, I'd like to speed up my solution, if I can.
I've made progress on most parts, but one which is soaking a good
chunk of time is turning this:

           mockData =: ] ((<. #)~ { ({.~ 1: + #)@:[) 0: , ([EMAIL PROTECTED] 
(>[EMAIL PROTECTED]
2r3&*)@:#)
           ]M=: 10 mockData >;:'ABC DEF GHI JKL MNO PQR STUV WXY Z'
        ABC
        STUV
        PQR

        WXY


        PQR
        WXY


Into this:

           drag M
        ABC
        STUV
        PQR
        PQR
        WXY
        WXY
        WXY
        PQR
        WXY
        WXY
        WXY

That is, I'd like to "drag" the last legitimate observation across the
information gaps that succeed it (up to but not including the next
legitimate observation).  The first observation is necessarily
legitimate.
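
The fill described above can be sketched one step at a time. This is only an illustration: the example column is typed in by hand so that it does not depend on mockData, and the names mask, good, nth and idx are hypothetical:

```j
NB. The example column from above, entered literally (blank rows are the gaps).
M    =: > 'ABC';'STUV';'PQR';'';'WXY';'';'';'PQR';'WXY';'';''

mask =: +./"1 ' ' ~: M    NB. 1 1 1 0 1 0 0 1 1 0 0  rows with any non-blank character
good =: I. mask           NB. 0 1 2 4 7 8            row numbers of the observations
nth  =: <: +/\ mask       NB. 0 1 2 2 3 3 3 4 5 5 5  which observation governs each row
idx  =: nth { good        NB. 0 1 2 2 4 4 4 7 8 8 8  source row for each output row

idx { M                   NB. the dragged column
```

Because the first row is always an observation, nth never goes below 0, so every row has a valid source.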

I've tried several approaches, including:

        drag0  =:  {~ (I. {~ _1 + +/\)@:     (+./"1@:~:&' ')
        drag1  =:  (;@:(<@:(# # 1 {. ]);.1)~ (+./"1@:~:&' '))
        drag2  =:  (#;.1 # #)~                +./"1@:~:&' '
        drag3  =:  [: > a: (] {~ (I. {~ _1 + +/\)@:~:)&.:s: ]

And so far  drag0  is as fast as I can get.  Of note is that each
column to process is stored as a N by 2 character array mapped to a
file.  So, as a last resort, I tried:

        DRAG     =:  verb define

           good  =.  1 + I. +./"1 ' ' ~: COLUMN_NAME
           msk   =.  0 < lens =. _1 + | 2 -/\ (1+#y),~ good
           runs  =.  good (+ i.)&.>&:(msk&#) lens

           for_run. runs do.
                        r           =. ; run
                        fill        =. COLUMN_NAME {~ <: {. r
                        COLUMN_NAME =. fill r} COLUMN_NAME
           end.

           i. 0 0
        )

To keep all the operations in-place.  But, because I have a large
number of small gaps, rather than a small number of large gaps, this
was the slowest solution of all (besides, I'd like to avoid an
explicit/looping solution if I can).
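
One loop-free variation that might be worth timing is a running maximum over the row numbers of the observations. This is a hedged sketch, not a tested improvement: dragMax and big are made-up names, the verb builds a new array rather than working in place on a mapped file, and it relies on the first row being an observation:

```j
NB. Hypothetical alternative: for each row, take the largest observation
NB. row number seen so far, then select those rows.
dragMax =: 3 : 0
  mask =. +./"1 ' ' ~: y
  (>./\ mask * i. # mask) { y
)

M   =: > 'ABC';'STUV';'PQR';'';'WXY';'';'';'PQR';'WXY';'';''
dragMax M

NB. Rough timing on a larger array built by cycling the rows of M
NB. (seconds per run, averaged over 10 runs).
big =: 1100000 $ M
10 (6!:2) 'dragMax big'
```

On the example, mask * i. # mask is 0 1 2 0 4 0 0 7 8 0 0 and its running maximum is 0 1 2 2 4 4 4 7 8 8 8, the same index list the mask-and-running-sum approach produces.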

-Dan

PS:  Another obstacle preventing me from wowing my colleagues is the
fact that J likes to crash when playing with arrays of mapped nouns.
When I've got a little more time I'll try to work up a minimal bug
report.

----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm




--
Björn Helgason, Engineer
Fugl&Fiskur ehf, Þerneyjarsund 23, Box 127
801 Grímsnes, email: [EMAIL PROTECTED]
Skype: gosiminn, gsm: +3546985532
Landscaping and ornamental garden construction, excavator services
http://groups.google.com/group/J-Programming


Technical skill handles the complex, creativity is the master of simplicity

A good teacher can step on toes without taking the shine off the shoes
         /|_      .-----------------------------------.
        ,'  .\  /  | With a light heart,               |
    ,--'    _,'   | today will be                     |
   /       /       | even better than yesterday        |
  (   -.  |        `-----------------------------------'
  |     ) |        (\_ _/)
 (`-.  '--.)       (='.'=)
  `. )----'        (")_(")