I agree that you need to have more than one cell on hand to do the nub!

However,  try this approach. It's crude, but it might match or beat C:

   - decide how much input you can process at once.
   - Result =. ''
   - while more hunks, loop over:
       -    hunkresult = nub-of-infix on current hunk of the input.
       -    Result = nub Result,hunkresult

This lets J do most of the heavy lifting, while keeping you
within the memory constraints.

Bob

On 15-02-03 03:45 PM, Joe Bogner wrote:
Sorry for the double email, I realized I replied to you directly. I'm
replying to you directly since you replied to me directly, just in
case you didn't want to send it to chat...

In any case, the whole goal is to get the nub of the 9 character
infixes. There is special code for some of the mathematical operations
on infixes (such as sum) that don't generate all the infixes, however
I having found anything for nub (distinct), which makes some sense
since it can't be reduced to an atom like the math operations

Thanks for the ideas


On Tue, Feb 3, 2015 at 3:42 PM, Robert Bernecky
<[email protected]> wrote:
"9 character infixes" - that looks like an input to a convolution, does it
not?
If so, then there should be some J dohickey (adverb or conjunction)
  (My J is very very rusty, and likely obsolete...)
to do that that will invoke your kernel verb without generating all the
infixes.

By convolution here, I mean something like moving-window inner
product, string search, or, err, convolution.

What processing do you apply to each infix?

Bob


On 15-02-03 03:36 PM, Joe Bogner wrote:
Thanks Bob. I agree that compiling often helps. I have previously
kicked around the idea of generating byte code or machine code and
interpreting it or executing it. I have called dlls in other
circumstances, but it would be nice to have everything in J.

In this case, the challenge put a 2GB constraint on memory and 60
seconds of execution time. It's an artificial constraint, which is why
I posted to chat. Those constraint wouldn't exist in the real world.

I'm scanning a 240MB text file and generating 9 character infixes and
processing them. I hit the memory limit using 9]\ so I figured I'd try
looping and using memr (15!:1). In an earlier version, about 270
seconds were spent in the looping construct and 40 seconds in memr. It
seemed like there should be a faster looping construct

Here's a simple version:

txt=: a. {~ (97&+(i. 26))
addr=: mema (# txt)
txt memw addr,0,(#txt)
6!:2 '([:<:([:0&{::(] ; smoutput&([:15!:1(addr,9,~<:@:]))    ))) ^:]
(<:#txt)'


     yz
xyz
wxyz
vwxyz
uvwxyz
tuvwxyz
stuvwxyz
rstuvwxyz
qrstuvwxy
pqrstuvwx
opqrstuvw
nopqrstuv
mnopqrstu
lmnopqrst
klmnopqrs
jklmnopqr
ijklmnopq
hijklmnop
ghijklmno
fghijklmn
efghijklm
defghijkl
cdefghijk
bcdefghij
abcdefghi


I would need more processing of the results, but you can see roughly
how it compares to 9]\

9[\txt
abcdefghi
bcdefghij
cdefghijk
defghijkl
efghijklm
fghijklmn
ghijklmno
hijklmnop
ijklmnopq
jklmnopqr
klmnopqrs
lmnopqrst
mnopqrstu
nopqrstuv
opqrstuvw
pqrstuvwx
qrstuvwxy
rstuvwxyz




On Tue, Feb 3, 2015 at 3:05 PM, Robert Bernecky
<[email protected]> wrote:
Compiling often helps. When I compile looping, scalar-dominated
APL code, I generally get about a factor of 1000X speedup.
You should get the same sort of speedup with compiled J.
[No, I do not have a J compiler, but will be happy to create one
if funding were to materialize.]

However, you mentioned that you were "looping due to memory
constraints...". Is it possible that a different algorithm would let
you do less looping? E.g., looping over 1e4 elements at once,
or using a different approach, such as dynamic programming?

Bob


On 15-02-03 02:53 PM, Joe Bogner wrote:
I'm playing around with a coding challenge that I'm solving with a
loop that needs to execute over 200 million times.

Setting aside whether that's a smart approach, what is the quickest
looping construct in J? I realize that J is not optimized for this
style of programming, but I'm still curious if it can be done
efficiently.

The quickest I've found takes about 7 seconds to iterate 100 million
times, whereas the same loop in C takes 0.03 seconds.

6!:2 '(] [ ])^:] 100e6'
7.00952

The train is taking up some time, but eventually I will need it

No train - 2.7 seconds

6!:2 ']^:] 100e6'
2.70656


Explicit:

loop=: 3 : 0
ctr=:0
while. 1 do.
ctr=:>:ctr
if. y < ctr do.
smoutput 'done'
return.
end.
end.
)

6!:2 'loop 100e6'
122.48

Clearly explicit isn't the way to go

Back to the tacit:

log=:smoutput bind ]

      ([: <: [: 0&{:: ] ; log )^:] 10
10
9
8
7
6
5
4
3
2
1
0

I would replace log with my function that uses the counter value

That's pretty slow though on 100 million iterations - 88 seconds

6!:2 '([: <: [: 0&{:: ] ; ] )^:] 100e6'
88.3644

Let's try a new log

     log=: ] <:@:[ (smoutput bind ])
      log 55
55
54


That looks like it could work. Let's put a dummy in for now

     log=: ] <:@:[ ]

6!:2 'log^:] 100e6'
40.9542

Still 40 seconds... Any ideas on how to speed up iteration like this?


Sidenote: I'm looping due to memory constraints placed on the coding
challenge
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm

--
Robert Bernecky
Snake Island Research Inc
18 Fifth Street
Ward's Island
Toronto, Ontario M5J 2B9

[email protected]
tel: +1 416 203 0854


--
Robert Bernecky
Snake Island Research Inc
18 Fifth Street
Ward's Island
Toronto, Ontario M5J 2B9

[email protected]
tel: +1 416 203 0854



--
Robert Bernecky
Snake Island Research Inc
18 Fifth Street
Ward's Island
Toronto, Ontario M5J 2B9

[email protected]
tel: +1 416 203 0854

----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm

Reply via email to