I could not find the cutP definition at a quick look, but from your example
it seems that by token you mean a single-character separator. That is not
general enough.

Also, imagine a fluffy xml file with millions of records, where only a
minority of the fields, of different types, are interesting, and some nodes
are missing fields of the type you are interested in.
Parsing the whole file is plainly infeasible, both because of performance
and because of the complexity of the resulting code.

Applying a verb at selected positions, obtained and reshaped by whatever
means, on the other hand works rather well and is easy to reason about.
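
To make the point concrete, here is a minimal sketch, assuming a made-up
record string txt in which only the id fields are of interest (txt, st, en
and sl are hypothetical names; doSL is the adverb from my original post).
The positions are found with E. and reshaped into start/length pairs before
any verb is applied:

```j
doSL =: 1 : '(,."1@[) u;.0 ]'       NB. start/length adverb from the original post

txt =: 'id=7;name=bob;id=12;'       NB. hypothetical sample data
st  =: 3 + I. 'id=' E. txt          NB. starts of the id values (just past each 'id=')
en  =: I. ';' = txt                 NB. positions of all field terminators
sl  =: st ,. st -~ en {~ en I. st   NB. pair each start with the distance to the next ';'

sl <doSL txt                        NB. box the selected fields only
sl ". doSL txt                      NB. or convert them to numbers directly
```

The name field is never touched, and a record without an id simply
contributes no row to sl.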

As for fsm, I do remember that I found ;: inconvenient when I was trying
to apply it. One issue was that there is no way to emit an empty word.
The other was that every possible value of the input domain has to be
represented as a row. When the domain is characters, the mapping is
manageable; for everything else, not so much. As a vague idea, if there
were a way to condense the input domain through a verb, possibly a dyad so
that an additional state could be passed, it would considerably expand the
use of ;: (with some performance hit, of course).
Also, code that uses ;: is pretty much unreadable, even by J standards.
Those were my initial impressions of it.
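
For reference, this is the kind of machine I mean, as a minimal sketch that
assumes I remember the (f;s;m) layout of ;: correctly: m maps every
character to an input class, and s holds a (new state, output code) pair
for each state and class. The names m and s and the sample string are made
up for illustration. With f=2 the machine is expected to return
start/length pairs instead of boxed words, which is the same shape that
doSL consumes:

```j
m =: a. e. '0123456789'              NB. class per character: 1 for digits, 0 for everything else
s =: 2 2 2 $ 0 0  1 1  0 3  1 0      NB. state 0 (outside): other -> stay; digit -> state 1, mark word start
                                     NB. state 1 (inside):  other -> state 0, emit word; digit -> stay

(0;s;m) ;: 'ab12cd345e'              NB. boxed runs of digits
(2;s;m) ;: 'ab12cd345e'              NB. the same words as a start/length table
```

Even for this two-class case the table says little to a reader without the
comments, and a run of length zero cannot be expressed at all.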


2017-06-11 15:43 GMT+03:00 'Pascal Jasmin' via Programming:

> A more general procedure than your request is to cut your data such that
> your start/end segments are in odd positions
>
>
> in jpp, https://github.com/Pascal-J/jpp
>
> cutP is a process for cutting on start and end tokens, though there are
> faster methods in included fsm.ijs file.  And that process could get
> significant boost if ;: were enhanced to support emitting empty boxes, but:
>
> cutP '(asdf)g()'
> ++----+-+++
> ||asdf|g|||
> ++----+-+++
>
> cutP is dyadic for start and end tokens other than '()'.
>
>
> also from jpp, the AltM adverb takes a gerund to apply cyclically to such
> an above cut structure.
>
>
> a:"_`u AltM would produce empties for non-odd positions.
>
> But if you only care about the selections, then either regex, or a ;:
> definition can extract them.
>
>
> ________________________________
> From: Danil Osipchuk
> To: Programming forum
> Sent: Sunday, June 11, 2017 7:19 AM
> Subject: [Jprogramming] Apply at start/lengths pairs
>
>
>
> Hi all,
>
> I wonder if there is an idiomatic way to apply a verb using an array of
> start and length pairs. This is a recurring pattern when extracting data
> from files.
> I've tried 3 adverbs (the example at the end), and the first one is
> slightly better on big files, but I'm still looking for possible
> improvements (the need is to extract selected fields from multi-gigabyte
> memory mapped csv/xml files)
>
>    'ab' xmlTagContentSL XML
> 11 1
> 22 2
> 34 3
> 47 4
>    'ab' <xmlTagDo XML
> +-+--+---+----+
> |1|20|300|4000|
> +-+--+---+----+
>    (2 2 $ 'ab'xmlTagContentSL XML) <doSL XML
> +---+----+
> |1  |20  |
> +---+----+
> |300|4000|
> +---+----+
>
> regards,
> Danil
>
> doSL =: 1 : '(,."1@[)u;.0]' NB. SL stands for start len pair
> NB. doSL =: 1 : '(0|:[:,:[)u;.0]'
> NB. doSL =: 1 : '(u;.0~ ,.)~"1'
>
> xmlTagOpn =: '<' ,'>',~]
> xmlTagCls =: '</','>',~]
>
> xmlTagContentSL =: 4 : 0
> CS =. (xmlTagOpn >x) (#@[ + I.@E.) y
> CE =. (xmlTagCls >x) I.@E. y
> CS ,. CE-CS
> )
>
> xmlTagDo =: 1 : '(xmlTagContentSL (u doSL) ])f.'
>
> XML =: 0 : 0
> <data>
> <ab>1</ab>
> <ab>20</ab>
> <ab>300</ab>
> <ab>4000</ab>
> </data>
> )
>
> ----------------------------------------------------------------------
>
> For information about J forums see http://www.jsoftware.com/forums.htm
>
