Re: [Jprogramming] cut with multiple subarrays

Joe Bogner Tue, 09 Dec 2014 17:41:28 -0800

Henry, thanks. In case others are interested, here are a few examples. NuVoc
describes it as taking a table parameter: "Argument x is a table with two
rows." http://www.jsoftware.com/jwiki/Vocabulary/semidot0


Does it actually take a brick and a table? I re-read
http://www.jsoftware.com/jwiki/Vocabulary/Nouns#Atom
Glad to have the NuVoc entry.

Here's an example closer to what I was looking for:

str=: 'ab|111*cd|222'
slices=:(0,2),(3,3),(7,2),:(10,3)

[ (4 2 1 $ ; slices) ;@:(<;.0) str

ab111cd222


and

[ (4 2 1 $ ; slices)  ];.0 str
ab
111
cd
222
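
On the brick-versus-table question, here is a minimal sketch of how the left
argument above is laid out, assuming the same str and slices. The name x3 is
only for illustration. As far as I can tell, each (start,length) row becomes
its own 2-by-1 table, so the whole left argument is a rank-3 array whose
items are the two-row tables that ;.0 expects:

   x3 =: ,."1 slices        NB. each (start,length) row becomes a 2x1 table
   $ x3
4 2 1
   x3 -: 4 2 1 $ , slices   NB. matches a 4 2 1 reshape of the ravelled table
1

If that reading is right, x3 ];.0 str should give the same four rows as the
example above.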





On Tue, Dec 9, 2014 at 6:00 PM, Henry Rich <[email protected]> wrote:

> There's some special code which IIRC this uses:
>
> NB. x is a string.  y is a table of (start,length) selections from x
> NB. Result is a string, with the selected parts of x run together.
> NB. Each interval in y must start within the bounds of x, even if the
> NB. length is 0
> substrs =: (;@:(<;.0)~  ,."1)~
>
> Henry Rich
>
>
> On 12/9/2014 2:55 PM, Joe Bogner wrote:
>
>> Brian, thank you. That also works, and I hadn't considered it, but the
>> performance is not so great on a large string:
>>
>>     timespacex 'str(<@[{&>~([<@:+i.@])/"1@:]) slices'
>> 2.44429 6.03512e8
>>     timespacex '(str ];.0~ ,.)"1 slices'
>> 0.385588 3.14331e8
>>
>>     $slices
>> 2000000 2
>>     $str
>> 10000000
>>
>> I don't know if cut ;.3 would help. I don't really understand the NuVoc
>> entry or the dictionary page for it (
>> http://www.jsoftware.com/jwiki/Vocabulary/semidot3#dyadic or
>> http://www.jsoftware.com/help/dictionary/d331.htm)
>>
>> It seems like a special case could be written if it doesn't exist for this
>> scenario.
>>
>> Using a 67MB string causes more than 2GB of memory use (out of memory
>> error) with: (str ];.0~ ,.)"1 slices ... It's really fast when memory
>> isn't an issue, even though it's calling jtcut02 for each row.
>>
>>
>> Back to our simple example:
>>
>> We could write some imperative code that would port well to C:
>>
>> subarrays2=: 4 : 0
>> max=: >./ 1{"1 y
>> rows=: {. $ y
>> ret=: (rows,max) $ ' '
>> row=:0
>> for_i. i. rows do.
>>   'start len'=: i { y
>>    col=:0
>>    for_n. start+i.len do.
>>      ret=:(n { x) (<row, col) } ret
>>      col=:>:col
>>    end.
>>    row=:>:row
>> end.
>> ret
>> )
>>
>> str=: 'ab|111*cd|222'
>> slices=:(0,2),(3,3),(7,2),:(10,3)
>>   ](str subarrays2 slices)
>> ab
>> 111
>> cd
>> 222
>>
>>
>> This loop has no memory pressure but is extremely slow in J. It completes
>> in about 2 minutes on 10 million rows of data. I suspect it would be very
>> fast in C, since the array can be pre-allocated and filled with two
>> nested loops.
>>
>> I could always patch my J binary with it but was wondering if something
>> already existed.
>>
>>
>>
>>
>>
>> On Tue, Dec 9, 2014 at 1:27 PM, Brian Schott <[email protected]>
>> wrote:
>>
>>> Joe,
>>>
>>> Is this relevant?
>>>
>>> str(<@[{&>~([<@:+i.@])/"1@:]) (0,2),(3,3),(7,2),:(10,3)
>>>
>>>
>>>
>>>
>>> --
>>> (B=)
>>> ----------------------------------------------------------------------
>>> For information about J forums see http://www.jsoftware.com/forums.htm
>>>
