Brian, thank you. That also works, and I hadn't considered it, but the
performance is not so great on a large string:

   timespacex 'str(<@[{&>~([<@:+i.@])/"1@:]) slices'
2.44429 6.03512e8
   timespacex '(str ];.0~ ,.)"1 slices'
0.385588 3.14331e8

   $slices
2000000 2
   $str
10000000

I don't know if cut ;.3 would help. I don't really understand the NuVoc
entry or the Dictionary entry for it (
http://www.jsoftware.com/jwiki/Vocabulary/semidot3#dyadic or
http://www.jsoftware.com/help/dictionary/d331.htm)

It seems like a special case could be written for this scenario if one
doesn't already exist.

Using a 67MB string with (str ];.0~ ,.)"1 slices takes more than 2GB of
memory and ends in an out of memory error. It's really fast when memory
isn't an issue, even though it's calling jtcut02 for each row.


Back to our simple example:

We could write some imperative code that would port well to C:

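subarrays2=: 4 : 0
NB. x: source string; y: table of (start,length) rows
max=. >./ 1{"1 y              NB. longest slice sets the column count
rows=. {. $ y
ret=. (rows,max) $ ' '        NB. pre-allocated, blank-padded result
for_i. i. rows do.
  'start len'=. i { y
  col=. 0
  for_n. start+i.len do.
    ret=. (n { x) (<i, col) } ret   NB. copy one character
    col=. >:col
  end.
end.
ret
)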

str=: 'ab|111*cd|222'
slices=:(0,2),(3,3),(7,2),:(10,3)
 ](str subarrays2 slices)
ab
111
cd
222


This loop puts no pressure on memory but is extremely slow in J. It completes
in about 2 minutes on 10 million rows of data. I suspect it would be very
fast in C, since the array can be pre-allocated and the work is just two
nested loops.
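
To make the idea concrete, here is a rough sketch of what such a C routine
might look like. It is a standalone illustration only, not code from the J
sources: the name subarrays, its signature, and the flattened
(start, length) layout of slices are all assumptions. The inner
character-by-character loop is replaced by memcpy, and the overflow check on
rows * max is left out for brevity.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of the J sources.
 * slices holds `rows` (start, length) pairs, flattened row by row.
 * Returns a malloc'd rows-by-(*cols) character table, blank padded,
 * or NULL on allocation failure or an out-of-range slice.
 * The caller frees the result. */
char *subarrays(const char *str, size_t str_len,
                const size_t *slices, size_t rows, size_t *cols)
{
    assert(str != NULL && slices != NULL && cols != NULL);

    /* First pass: the widest slice sets the column count;
     * every slice is also bounds-checked against the string. */
    size_t max = 0;
    for (size_t r = 0; r < rows; r++) {
        size_t start = slices[2 * r], len = slices[2 * r + 1];
        if (start > str_len || len > str_len - start)
            return NULL;
        if (len > max)
            max = len;
    }
    *cols = max;

    /* Pre-allocate once (at least one byte, so malloc(0) is never used). */
    size_t total = rows * max;
    char *ret = malloc(total ? total : 1);
    if (ret == NULL)
        return NULL;
    memset(ret, ' ', total);

    /* Second pass: copy each slice into the start of its row. */
    for (size_t r = 0; r < rows; r++)
        memcpy(ret + r * max, str + slices[2 * r], slices[2 * r + 1]);
    return ret;
}
```

Called with the string 'ab|111*cd|222' and the four slices from the example
above, this should yield a 4 by 3 table holding the same rows as the J
version. The only allocation is the result itself, so memory use stays at
rows times the longest slice length.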

I could always patch my J binary with it but was wondering if something
already existed.





On Tue, Dec 9, 2014 at 1:27 PM, Brian Schott <[email protected]> wrote:

> Joe,
>
> Is this relevant?
>
> str(<@[{&>~([<@:+i.@])/"1@:]) (0,2),(3,3),(7,2),:(10,3)
>
>
>
>
> --
> (B=)
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm
>