About performance: I just ran a quick experiment in the Pier migration application, where I need split and join.
I use split and join to remove comments from HTML files. I ran the tests without removing comments, and then removing them with each of the two split/join implementations. Keith's sequence splitter is blindingly fast, imposing no discernible overhead, whereas my regex version slows all the tests down by 100%! I would still like to have splitting on regexes, but it should probably not be the default for strings. Maybe we can improve the implementation and speed it up ...

- on

On Apr 5, 2009, at 18:03, Oscar Nierstrasz wrote:

>
> With Keith's version you can do this:
>
> #(1 10 11 2 10 11 3 10 11 4) splitOn: #(10 11)
>
> I was assuming that the thing we use to split was a regex string.
>
> 'hello there' split: '\s'
>
> Actually I see that Damien added this possibility in RubyShards as
> well. This also works:
>
> #(1 10 11 2 10 11 3 10 11 4) split: #(10 11)
>
> It seems that RubyShards is more general, but we need to take a closer
> look at both solutions. The interfaces are not the same. There may
> be differences in performance.
>
> - on
>
>
> On Apr 5, 2009, at 17:47, Stéphane Ducasse wrote:
>
>> I would be in favor of having a nice oo solution :)
>> I do not know what "uses a sequence to split a sequence" means.
>>
>> Stef
>>
>>> OK, I had a closer look.
>>>
>>> Keith's implementation is completely different from, and pre-dates,
>>> that of Damien and myself.
>>>
>>> Keith's version works for SequenceableCollections, and uses a
>>> sequence to split a sequence.
>>>
>>> Ours is more tailored towards Strings, and uses a regex to split a
>>> String.
>>>
>>> Perhaps we can consider a merge in which sequences can be split
>>> using sequences, and Strings can additionally be split using regexes.
>>>
>>> We should also take efficiency into account. I did not run any
>>> benchmarks yet to compare the implementations.
>>>
>>> Who is interested in merging these two?
>>>
>>> Cheers,
>>> - on
>>>
>>> On Apr 5, 2009, at 16:25, Oscar Nierstrasz wrote:
>>>
>>>>
>>>> Hi Keith,
>>>>
>>>> Now I see there are attached files in Mantis. But they all seem to
>>>> date from 2006, whereas your latest comments are from Jan 2009.
>>>> Are there more recent files from 2009 that I should look at? If so,
>>>> where are they?
>>>>
>>>> What is the best way to proceed? Shall I create a Join project on
>>>> SqueakSource, and if it is updated, post the latest version on
>>>> Mantis too?
>>>>
>>>> Cheers,
>>>> - on
>>>>
>>>> On Apr 5, 2009, at 16:08, Keith Hodges wrote:
>>>>
>>>>> Stéphane Ducasse wrote:
>>>>>>> I wrote the split join implementation that is available on
>>>>>>> mantis
>>>>>>>
>>>>>>> http://bugs.squeak.org/view.php?id=4874
>>>>>>>
>>>>>>> I use it all the time, if you would like to improve on what is
>>>>>>> there, please continue to contribute to the mantis page
>>>>>>> discussion/tests and code. That way we will get a polished
>>>>>>> implementation that can be added to squeak or to pharo.
>>>>>>>
>>>>>>> The suggestion to use #species would be fine (I never use
>>>>>>> species myself, because I don't understand what it's really for).
>>>>>>>
>>>>>>
>>>>>> or class
>>>>>> the point is that you get back a collection of the same kind as
>>>>>> the receiver
>>>>>>
>>>>>>> When stef says "I have checked the code and it looks nice" he
>>>>>>> didn't say which code he checked, so I am confused.
>>>>>>>
>>>>>>
>>>>>> I looked at the latest version in the repository mentioned by
>>>>>> oscar rubyshards
>>>>>>
>>>>>>
>>>>> Which appears to me to be the opposite of what Oscar suggested.
>>>>> If I read the email, he asked what the status of mantis 4874 was,
>>>>> anticipating that it be integrated. He had "gone back" to ruby
>>>>> shards in the absence of the integration of 4874.
>>>>>
>>>>> Keith
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> Pharo-project mailing list
>>>>> [email protected]
>>>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
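
To make the comparison at the top of this message a bit more concrete, here is a rough Python analogue of the two splitting styles discussed in the thread: splitting a sequence on a sub-sequence, and splitting a string on a regex. It is only a sketch under stated assumptions. The names split_on, split_regex and time_splitters are hypothetical, and this is not the Smalltalk code from Mantis 4874 or RubyShards, so any timings it produces say nothing about those implementations.

```python
import re
import timeit


def split_on(seq, sep):
    """Split seq at every occurrence of the sub-sequence sep (hypothetical helper)."""
    seq, sep = list(seq), list(sep)
    if not sep:
        raise ValueError("separator must not be empty")
    parts, current, i = [], [], 0
    while i < len(seq):
        if seq[i:i + len(sep)] == sep:
            # Separator found: close the current part and skip past the separator.
            parts.append(current)
            current = []
            i += len(sep)
        else:
            current.append(seq[i])
            i += 1
    parts.append(current)
    return parts


def split_regex(text, pattern):
    """Split a string wherever the regex pattern matches."""
    return re.split(pattern, text)


def time_splitters(text, repeat=1000):
    """Time a literal split against a regex split on the same input.

    Returns (literal_seconds, regex_seconds); the values depend on the machine.
    """
    literal = timeit.timeit(lambda: text.split("<!--"), number=repeat)
    regex = timeit.timeit(lambda: re.split(r"<!--", text), number=repeat)
    return literal, regex


if __name__ == "__main__":
    print(split_on([1, 10, 11, 2, 10, 11, 3, 10, 11, 4], [10, 11]))  # [[1], [2], [3], [4]]
    print(split_regex("hello there", r"\s"))  # ['hello', 'there']
    print(time_splitters("<p>a</p><!-- note --><p>b</p>" * 100))
```

The sequence splitter never touches a regex engine, which is one plausible reason a sequence-based default could be cheaper for plain strings, with regex splitting kept as an opt-in.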
