Kit Teather wrote:
> I ran some timings on a subset of the data I had been using and results were:
> myself 21.75 secs
> Roger 21.88 secs
> Dan 21.93 secs.
> Fraser 21.93 secs.
> I'll probably therefore stick with my own. I was very interested in the use
> of ^: iterated 0 or 1 time.
I think I understand this mystery now. A loop-free version run on the
same data is likely to take about 0.01 secs.
I don't know exactly what data Kit is working with, and assume it is a
boxed table of formatted numbers. It matters little whether it is
n-dimensional, or whether there is some character data involved. The
latter may slow things down a little, but not significantly.
In the script below, the data is a 60 x 100 table, which gives timings
similar to Kit's. Verb savecsv writes this out in csv format. Verb
savecsvkt uses Kit's csvminus verb to format the data, then calls the
csv utility writecsv to write the data. Timings for these two verbs are
approximately 0.01 seconds and 19 seconds respectively.
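To make the loop-free approach concrete, here is a minimal sketch of the
steps that savecsv in the script below performs, applied to a made-up
2 x 3 table instead of the real data. The names t, s and ndx are only
for illustration, and the final write to file is left out.

```j
NB. hypothetical 2 x 3 boxed table of formatted numbers
t=: 2 3 $ '1.50';'_2.25';'3.00';'_4.75';'5.00';'6.25'

s=: ; t ,each ','            NB. raze: every field followed by a comma
s=: '-' (I. s='_') } s       NB. amend every underscore to a minus sign
ndx=: (-{:$t) {:\ I. s=','   NB. index of the last comma in each row
s=: LF ndx } s               NB. row-ending commas become line feeds

NB. s is now '1.50,-2.25,3.00',LF,'-4.75,5.00,6.25',LF
```

Every step works on the whole character list at once, so the cost grows
with the size of the data rather than with the number of loop iterations.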
There are two problems in Kit's verb. The more serious is that the
result is built up by appending to a boxed list. Actually, for negative
numbers, the append is done twice. Each such append moves the data
around, and is very time consuming. If the data were 100 x 100, the time
would become 80 seconds, so clearly csvminus does not scale well. A
quick fix that replaces items in place instead of appending (verb
csvminus2) reduces the time to 7 seconds.
The rest of the time in csvminus is spent on the loop. Any way of
avoiding the loop or reducing the amount of looping will speed things
up. In practice, I doubt any looping is needed, and I expect that the
technique of savecsv can be used instead.
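If the boxed table itself is wanted rather than a flat string, the
replace can probably also be done without a loop. The following is only
a sketch of one way to do that, not a drop-in replacement for csvminus.
The names t, b and m are made up for the example, and numbers is copied
from Kit's verb.

```j
NB. hypothetical 2 x 3 boxed table; '_n/a' is text and must stay as it is
t=: 2 3 $ '1.50';'_2.25';'_n/a';'_4.75';'5.00';'6.25'
numbers=: '0123456789_.abdejprx'

b=: , t                                      NB. work on the ravelled boxes
NB. mask: box starts with '_' and holds only characters from numbers
m=: ('_' = {.@> b) *. *./@(e.&numbers)@> b
b=: (('-' , }.) each m # b) (I. m) } b       NB. one amend for all selected boxes
r=: ($t) $ b                                 NB. restore the original shape

NB. r is 2 3 $ '1.50';'-2.25';'_n/a';'-4.75';'5.00';'6.25'
```

The mask is computed for all boxes in one go, and only the boxes that
need changing are rebuilt, with a single amend at the end.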
require 'csv files'
shp=: 60 100
dat=: shp $ 0j2 ": each o. _15 + ?~30
F=: jpath '~temp/t1.csv'
G=: jpath '~temp/t2.csv'
H=: jpath '~temp/t3.csv'
NB. boxed_table savecsv file
savecsv=: 4 : 0
dat=. ; x. ,each ','
dat=. '-' (I. dat='_') } dat
ndx=. (-{:$x.) {:\ I. dat = ','
dat=. LF ndx } dat
dat fwrites y.
)
NB. Kit's csvminus: builds the result by appending to a boxed list
csvminus=: 3 : 0
shape=. $y.
numbers=. '0123456789_.abdejprx'
listboxes=. ,y.
new=. 0$0
while.0<#listboxes
do. current=. >{.listboxes
listboxes=. }.listboxes
new=. new,<current
if. -.(*./current e.numbers)
do. continue
elseif. '_'={.current
do. new=. (}:new),<'-',}.current
end.
end.
shape $ new
)
NB. csvminus with replace
csvminus2=: 3 : 0
shape=. $y.
numbers=. '0123456789_.abdejprx'
new=. ,y.
for_d. new do.
current=. >d
if. *./current e.numbers do.
if. '_'={.current
do. new=. (<'-',}.current) d_index } new
end.
end.
end.
shape $ new
)
savecsvkt=: 4 : '(csvminus x.) writecsv y.'
savecsvkt2=: 4 : '(csvminus2 x.) writecsv y.'
timex=: 6!:2
timex 'dat savecsv F'
timex 'dat savecsvkt G'
timex 'dat savecsvkt2 H'
(3 3{. readcsv F) ; (3 3{. readcsv G); <3 3{. readcsv H