Re: [Jprogramming] Apply at start/lengths pairs

Raul Miller Sun, 11 Jun 2017 07:07:28 -0700

I believe readability is the reason ;: was not made more general.
Though efficiency might also have been a part of that.


-- 
Raul

On Sun, Jun 11, 2017 at 9:59 AM, Danil Osipchuk
<[email protected]> wrote:
> I could not find a cutP definition with a quick look, but from your example
> it seems like you mean a character separator by token. It is not general
> enough.
>
> Also, imagine a fluffy xml file with millions of records, where only a
> minority of the fields, of different types, are interesting, and some nodes
> are missing the fields you are interested in.
> Parsing the whole file is plainly unfeasible because of performance and
> the complexity of the resulting code.
>
> Applying a verb at selected positions, obtained and reshaped by whatever
> means, however works rather well and is easy to reason about.
>
> As for fsm, I do remember that I found ;: inconvenient when I was trying
> to apply it - and one issue was that there is no way to emit an empty word.
> The other was that you have to have every possible value of the input domain
> represented as a row. When the domain is characters the mapping is
> manageable, for everything else - not so much. As a vague idea, if there
> were a way to condense the input domain through a verb, possibly a dyad to
> pass an additional state, it would considerably expand the use of ;: with
> some performance hit of course.
> Also the code utilizing ;: is pretty much unreadable even by J standards.
> Those were my initial impressions of it.
>
>
> 2017-06-11 15:43 GMT+03:00 'Pascal Jasmin' via Programming <
> [email protected]>:
>
>> A more general procedure than your request is to cut your data such that
>> your start/end segments are in odd positions.
>>
>>
>> in jpp, https://github.com/Pascal-J/jpp
>>
>> cutP is a process for cutting on start and end tokens, though there are
>> faster methods in the included fsm.ijs file.  And that process could get a
>> significant boost if ;: were enhanced to support emitting empty boxes, but:
>>
>>    cutP '(asdf)g()'
>> ++----+-+++
>> ||asdf|g|||
>> ++----+-+++
>>
>> cutP is dyadic for start and end tokens other than '()'.
>>
>>
>> also from jpp, the AltM adverb takes a gerund to apply cyclically to such
>> an above cut structure.
>>
>>
>> a:"_`u AltM would produce empties for non-odd positions.
>>
>> But if you only care about the selections, then either regex, or a ;:
>> definition can extract them.
>>
>>
>> ________________________________
>> From: Danil Osipchuk <[email protected]>
>> To: Programming forum <[email protected]>
>> Sent: Sunday, June 11, 2017 7:19 AM
>> Subject: [Jprogramming] Apply at start/lengths pairs
>>
>> Hi all,
>>
>> I wonder if there is an idiomatic way to apply a verb using an array of
>> start and length pairs. This is a recurring pattern when extracting data
>> from files.
>> I've tried 3 adverbs (the example at the end), and the first one is
>> slightly better on big files, but I'm still looking for possible
>> improvements (the need is to extract selected fields from multi-gigabyte
>> memory mapped csv/xml files)
>>
>>    'ab' xmlTagContentSL XML
>> 11 1
>> 22 2
>> 34 3
>> 47 4
>>    'ab' <xmlTagDo XML
>> +-+--+---+----+
>> |1|20|300|4000|
>> +-+--+---+----+
>>    (2 2 $ 'ab'xmlTagContentSL XML) <doSL XML
>> +---+----+
>> |1  |20  |
>> +---+----+
>> |300|4000|
>> +---+----+
>>
>> regards,
>> Danil
>>
>> doSL =: 1 : '(,."1@[)u;.0]' NB. SL stands for start len pair
>> NB. doSL =: 1 : '(0|:[:,:[)u;.0]'
>> NB. doSL =: 1 : '(u;.0~ ,.)~"1'
>>
>> xmlTagOpn =: '<' ,'>',~]
>> xmlTagCls =: '</','>',~]
>>
>> xmlTagContentSL =: 4 : 0
>> CS =. (xmlTagOpn >x) (#@[ + I.@E.) y
>> CE =. (xmlTagCls >x) I.@E. y
>> CS ,. CE-CS
>> )
>>
>> xmlTagDo =: 1 : '(xmlTagContentSL (u doSL) ])f.'
>>
>> XML =: 0 : 0
>> <data>
>> <ab>1</ab>
>> <ab>20</ab>
>> <ab>300</ab>
>> <ab>4000</ab>
>> </data>
>> )
>>
>> ----------------------------------------------------------------------
>>
>> For information about J forums see http://www.jsoftware.com/forums.htm
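
For readers following the thread, here is a minimal sketch of the start/length idea discussed above, using the same u;.0 construction as the first doSL definition. The names TEXT, SL and atSL are made up for this illustration and do not come from any of the messages; the sample string is likewise an assumption, chosen so the offsets are easy to check by hand.

```j
NB. Illustrative sketch only: TEXT, SL and atSL are hypothetical names.
TEXT =: 'alpha,beta,gamma'

NB. One field. The left argument of u;.0 is a 2-row table:
NB. start index in the first row, length in the second (one column per axis).
(,. 6 4) ];.0 TEXT          NB. beta

NB. Several fields: one start/length pair per row of SL.
SL =: 3 2 $ 0 5  6 4  11 5

NB. ,."1 turns each pair into a 2x1 table; u;.0 has left rank 2,
NB. so u is applied once per pair.
atSL =: 1 : '(,."1@[) u;.0 ]'

SL <atSL TEXT               NB. boxes holding alpha, beta, gamma
SL #atSL TEXT               NB. 5 4 5
```

Because the pairs are plain data, they can be filtered or reshaped before the verb is applied, which is what the 2 2 $ example in the original message does.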
