Your doSL is roughly what I would expect to use for this kind of thing.

(I think I would have phrased it

   doSL=: 1 :'(u;.0~ , $~ ,&1@$)~'

but I don't know if that's any better, performance wise.)

It's possible that the interpreter could be improved here...

Good luck,

--
Raul

On Sun, Jun 11, 2017 at 7:19 AM, Danil Osipchuk <[email protected]> wrote:
> Hi all,
>
> I wonder if there is an idiomatic way to apply a verb using an array of
> start and length pairs. This is a recurring pattern when extracting data
> from files.
> I've tried 3 adverbs (the example at the end), and the first one is
> slightly better on big files, but I'm still looking for possible
> improvements (the need is to extract selected fields from multi-gigabyte
> memory mapped csv/xml files)
>
>    'ab' xmlTagContentSL XML
> 11 1
> 22 2
> 34 3
> 47 4
>
>    'ab' <xmlTagDo XML
> +-+--+---+----+
> |1|20|300|4000|
> +-+--+---+----+
>
>    (2 2 $ 'ab'xmlTagContentSL XML) <doSL XML
> +---+----+
> |1  |20  |
> +---+----+
> |300|4000|
> +---+----+
>
> regards,
> Danil
>
> doSL =: 1 : '(,."1@[)u;.0]' NB. SL stands for start len pair
> NB. doSL =: 1 : '(0|:[:,:[)u;.0]'
> NB. doSL =: 1 : '(u;.0~ ,.)~"1'
>
> xmlTagOpn =: '<' ,'>',~]
> xmlTagCls =: '</','>',~]
>
> xmlTagContentSL =: 4 : 0
> CS =. (xmlTagOpn >x) (#@[ + I.@E.) y
> CE =. (xmlTagCls >x) I.@E. y
> CS ,. CE-CS
> )
>
> xmlTagDo =: 1 : '(xmlTagContentSL (u doSL) ])f.'
>
> XML =: 0 : 0
> <data>
> <ab>1</ab>
> <ab>20</ab>
> <ab>300</ab>
> <ab>4000</ab>
> </data>
> )
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm
