I believe readability is the reason ;: was not made more general, though efficiency may also have played a part.

--
Raul

On Sun, Jun 11, 2017 at 9:59 AM, Danil Osipchuk <[email protected]> wrote:
> I could not find a cutP definition with a quick look, but from your example
> it seems that by "token" you mean a character separator. That is not general
> enough.
>
> Also, imagine a fluffy xml file, with millions of records, where only a
> minority of fields of different type in records are interesting, and some
> nodes have missing fields of the type you are interested in.
> Parsing the whole file is plainly unfeasible because of performance and
> complexity of the resulting code.
>
> Applying at selected positions obtained and reshaped by whatever means
> however works rather well and is easy to reason about.
>
> As for fsm, I do remember that I found ;: inconvenient when I was trying
> to apply it - and one issue was that there is no way to emit an empty word.
> The other was that you have to have every possible value of input domain
> represented as a row. When the domain is characters the mapping is
> manageable, for everything else - not so much. As a vague idea, if there
> was a way to condense the input domain through a verb, possibly a dyad to
> pass an additional state, it would considerably expand the use of ;: with
> some performance hit of course.
> Also the code utilizing ;: is pretty much unreadable even by J standards.
> Those were my initial impressions of it.
>
>
> 2017-06-11 15:43 GMT+03:00 'Pascal Jasmin' via Programming <
> [email protected]>:
>
>> A more general procedure than your request is to cut your data such that
>> your start/end segments are in odd positions
>>
>>
>> in jpp, https://github.com/Pascal-J/jpp
>>
>> cutP is a process for cutting on start and end tokens, though there are
>> faster methods in included fsm.ijs file.  And that process could get
>> significant boost if ;: were enhanced to support emitting empty boxes, but:
>>
>> cutP '(asdf)g()'
>> ++----+-+++
>> ||asdf|g|||
>> ++----+-+++
>>
>> cutP is dyadic for start and end tokens other than '()'.
>>
>>
>> also from jpp, the AltM adverb takes a gerund to apply cyclically to such
>> an above cut structure.
>>
>>
>> a:"_`u AltM would produce empties for non-odd positions.
>>
>> But if you only care about the selections, then either regex, or a ;:
>> definition can extract them.
>>
>>
>> ________________________________
>> From: Danil Osipchuk <[email protected]>
>> To: Programming forum <[email protected]>
>> Sent: Sunday, June 11, 2017 7:19 AM
>> Subject: [Jprogramming] Apply at start/lengths pairs
>>
>>
>> >> Hi all,
>>
>> >> I wonder if there is an idiomatic way to apply a verb using an array of
>> >> start and length pairs. This is a recurring pattern when extracting data
>> >> from files.
>> >> I've tried 3 adverbs (the example at the end), and the first one is
>> >> slightly better on big files, but I'm still looking for possible
>> >> improvements (the need is to extract selected fields from multi-gigabyte
>> >> memory mapped csv/xml files)
>>
>> >>    'ab' xmlTagContentSL XML
>> >> 11 1
>> >> 22 2
>> >> 34 3
>> >> 47 4
>>
>> >>    'ab' <xmlTagDo XML
>> >> +-+--+---+----+
>> >> |1|20|300|4000|
>> >> +-+--+---+----+
>>
>> >>    (2 2 $ 'ab'xmlTagContentSL XML) <doSL XML
>> >> +---+----+
>> >> |1  |20  |
>> >> +---+----+
>> >> |300|4000|
>> >> +---+----+
>>
>>
>> >> regards,
>> >> Danil
>>
>> >> doSL =: 1 : '(,."1@[)u;.0]' NB. SL stands for start len pair
>> >> NB. doSL =: 1 : '(0|:[:,:[)u;.0]'
>> >> NB. doSL =: 1 : '(u;.0~ ,.)~"1'
>>
>> >> xmlTagOpn =: '<' ,'>',~]
>> >> xmlTagCls =: '</','>',~]
>>
>> >> xmlTagContentSL =: 4 : 0
>> >> CS =. (xmlTagOpn >x) (#@[ + I.@E.) y
>> >> CE =. (xmlTagCls >x) I.@E. y
>> >> CS ,. CE-CS
>> >> )
>>
>> >> xmlTagDo =: 1 : '(xmlTagContentSL (u doSL) ])f.'
>>
>> >> XML =: 0 : 0
>> >> <data>
>> >> <ab>1</ab>
>> >> <ab>20</ab>
>> >> <ab>300</ab>
>> >> <ab>4000</ab>
>> >> </data>
>> >> )
>> >> ----------------------------------------------------------------------
>> >> For information about J forums see http://www.jsoftware.com/forums.htm
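
For anyone who wants to try the start/length idiom from the original post without the XML helpers, here is a minimal sketch on a short string. The name sl and the sample string are illustrative only and do not come from the thread; doSL is the first definition quoted above. It relies on the left argument of ;.0 being a table whose first row holds the start and whose second row holds the length, which is what ,."1 builds from each start/length row.

```j
NB. Sketch only: doSL as quoted in the thread, applied to a toy string.
doSL =: 1 : '(,."1@[)u;.0]'

NB. sl is a hypothetical table of (start, length) rows.
sl =: 2 2 $ 0 3 4 2

NB. Box the substring selected by each row of sl.
r =: sl <doSL 'abcdefgh'
NB. r should hold two boxes: 'abc' (start 0, length 3) and 'ef' (start 4, length 2)
```

Any verb can replace < here, for example # to count the characters in each field, since doSL simply hands each selected piece to u.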
