If you are doing many operations on the files, it may be beneficial to
convert the chars into numbers.
At least in my past experience, APL used to work a hell of a lot faster
on numbers than on chars.
If you only work on each file once, then the inversion process is going to
be too heavy to pay off.
In ADI, the inverting was done by a PL/I program when it involved huge files.
I have no idea whether J is a lot faster on numbers than on chars, as APL was.
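
A minimal sketch of what such a conversion could look like in J, assuming each column is an N by 2 character table as described in the quoted message below. The names M, n, blank, m, filled and back are made up for illustration. Whether the numeric form is actually faster is untested here; timing both versions (for example with 6!:2) would settle it.

```j
M      =: > 'AB' ; '  ' ; 'CD' ; '  ' ; '  '   NB. toy 5 by 2 character column
n      =: 256 #. a. i. M                      NB. one integer per row: 16706 8224 17220 8224 8224
blank  =: 256 #. a. i. '  '                   NB. 8224, the code of an all-blank row
m      =: n ~: blank                          NB. 1 0 1 0 0 marks the legitimate rows
filled =: ((_1 + +/\ m) { I. m) { n           NB. 16706 16706 17220 17220 17220
back   =: a. {~ 256 256 #: filled             NB. convert back to a 5 by 2 character table
```

The conversion and the conversion back each touch the whole column once, so this only makes sense when many operations are done on the numeric form in between.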
2006/12/11, Dan Bron <[EMAIL PROTECTED]>:
Hello all,
My current project involves processing ~160MB text files (I have one
file per day). With a little trickery and convincing, J performs
admirably fast on these (~12 seconds per file, so about 6 minutes per
month, or somewhat over an hour per year).
But I'm an impatient man. Also, I'd like to practice a little J
evangelism here. So, I'd like to speed up my solution, if I can.
I've made progress on most parts, but one which is soaking a good
chunk of time is turning this:
mockData =: ] ((<. #)~ { ({.~ 1: + #)@:[) 0: , ([EMAIL PROTECTED]
(>[EMAIL PROTECTED]
2r3&*)@:#)
]M=: 10 mockData >;:'ABC DEF GHI JKL MNO PQR STUV WXY Z'
ABC
STUV
PQR

WXY


PQR
WXY


Into this:
drag M
ABC
STUV
PQR
PQR
WXY
WXY
WXY
PQR
WXY
WXY
WXY
That is, I'd like to "drag" the last legitimate observation across the
information gaps that succeed it (up to but not including the next
legitimate observation). The first observation is necessarily
legitimate.
I've tried several approaches, including:
drag0 =: {~ (I. {~ _1 + +/\)@: (+./"1@:~:&' ')
drag1 =: (;@:(<@:(# # 1 {. ]);.1)~ (+./"1@:~:&' '))
drag2 =: (#;.1 # #)~ +./"1@:~:&' '
drag3 =: [: > a: (] {~ (I. {~ _1 + +/\)@:~:)&.:s: ]
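
To make the idea behind drag0 easier to follow, here is a step-by-step sketch on a toy table. The names T, m, src, k and idx are hypothetical and are not part of the original code; drag0 is copied from the list above.

```j
drag0 =: {~ (I. {~ _1 + +/\)@: (+./"1@:~:&' ')   NB. copied from the list above

T   =: > 'ABC' ; '   ' ; 'PQR' ; '   ' ; '   '   NB. toy table; all-blank rows are gaps
m   =: +./"1 T ~: ' '      NB. 1 0 1 0 0   rows holding any non-blank character
src =: I. m                NB. 0 2         row numbers of the legitimate observations
k   =: _1 + +/\ m          NB. 0 0 1 1 1   which observation covers each row
idx =: k { src             NB. 0 0 2 2 2   source row for every output row
(idx { T) -: drag0 T       NB. 1, the explicit steps match the tacit verb
```

The running sum counts how many legitimate rows have been seen so far, which is why the first observation has to be legitimate: otherwise k would start at _1.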
And so far drag0 is as fast as I can get. Of note is that each
column to process is stored as an N by 2 character array mapped to a
file. So, as a last resort, I tried:
DRAG =: verb define
good =. 1 + I. +./"1 ' ' ~: COLUMN_NAME
msk =. 0 < lens =. _1 + | 2 -/\ (1+#y),~ good
runs =. good (+ i.)&.>&:(msk&#) lens
for_run. runs do.
r =. ; run
fill =. COLUMN_NAME {~ <: {. r
COLUMN_NAME =. fill r} COLUMN_NAME
end.
i. 0 0
)
To keep all the operations in-place. But, because I have a large
number of small gaps, rather than a small number of large gaps, this
was the slowest solution of all (besides, I'd like to avoid an
explicit/looping solution if I can).
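
One loop-free variant of the same in-place idea is to compute every gap row and its source row up front and then do a single amend. This is only a sketch: COL is a toy stand-in for the mapped column, the names m, gaps and src are hypothetical, and whether J actually performs this amend in place on a mapped noun (or runs faster than drag0) is an assumption that has not been checked here.

```j
COL  =: > 'AB' ; '  ' ; 'CD' ; '  ' ; '  '   NB. toy stand-in for a mapped N by 2 column
m    =: +./"1 COL ~: ' '                    NB. 1 0 1 0 0   legitimate rows
gaps =: I. -. m                             NB. 1 3 4       rows that need filling
src  =: (_1 + +/\ m) { I. m                 NB. 0 0 2 2 2   source row for every row
COL  =: ((gaps { src) { COL) gaps} COL      NB. overwrite only the gap rows, in one amend
```

Only the gap rows are written, so the number of amend operations no longer depends on how many separate gaps there are.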
-Dan
PS: Another obstacle preventing me from wowing my colleagues is the
fact that J likes to crash when playing with arrays of mapped nouns.
When I've got a little more time I'll try to work up a minimal bug
report.
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
--
Björn Helgason, Engineer
Fugl&Fiskur ehf, Þerneyjarsund 23, Box 127
801 Grímsnes, email: [EMAIL PROTECTED]
Skype: gosiminn, gsm: +3546985532
Landscaping and ornamental gardening, excavator services
http://groups.google.com/group/J-Programming
Technical skill handles the complex; creativity is the master of simplicity
A good teacher can step on toes without the shine coming off the shoes
With a light heart, today will be even better than yesterday