OK, let us know when you have a solution ready for integration, because this would be 8 methods well spent :)
On Apr 5, 2009, at 6:27 PM, Oscar Nierstrasz wrote:

>
> Oops. I made a mistake in the experiment. There is actually less
> difference than I thought.
>
> Here we load a web site, optionally using split and join to remove all
> comments. My regex version seems to be only marginally worse than
> Keith's sequence splitting.
>
>
> 5289 "ON split on regex"
> 5327
>
> 5165 "KH split on sequence"
> 5160
>
> 2153 "no splitting"
> 2160
>
> So regex splitting seems to be feasible.
>
> I can try to have a closer look and propose a merged solution, but
> right now my plate is rather full.
>
> - on
>
>
> On Apr 5, 2009, at 18:14, Oscar Nierstrasz wrote:
>
>>
>> About performance:
>>
>> I just did a quick experiment in the Pier migration application where
>> I need split and join.
>>
>> I use split and join to remove comments from HTML files. I ran the
>> tests without removing comments, and removing them using the two
>> different split/join implementations.
>>
>> Keith's sequence splitter is blindingly fast, imposing no discernible
>> overhead, whereas my regex version slows all the tests down by 100%!
>>
>> I would still like to have splitting on regexes, but it should
>> probably not be the default for strings. Maybe we can improve the
>> implementation and speed it up ...
>>
>> - on
>>
>> On Apr 5, 2009, at 18:03, Oscar Nierstrasz wrote:
>>
>>>
>>> With Keith's version you can do this:
>>>
>>> #(1 10 11 2 10 11 3 10 11 4) splitOn: #(10 11)
>>>
>>> I was assuming that the thing we use to split was a regex string.
>>>
>>> 'hello there' split: '\s'
>>>
>>> Actually I see that Damien added this possibility in RubyShards as
>>> well. This also works:
>>>
>>> #(1 10 11 2 10 11 3 10 11 4) split: #(10 11)
>>>
>>> It seems that RubyShards is more general, but we need to take a
>>> closer look at both solutions. The interfaces are not the same.
>>> There may be differences in performance.
>>>
>>> - on
>>>
>>>
>>> On Apr 5, 2009, at 17:47, Stéphane Ducasse wrote:
>>>
>>>> I would be in favor of having a nice OO solution :)
>>>> I do not know what "uses a sequence to split a sequence" means.
>>>>
>>>> Stef
>>>>
>>>>> OK, I had a closer look.
>>>>>
>>>>> Keith's implementation is completely different from, and pre-dates,
>>>>> that of Damien and myself.
>>>>>
>>>>> Keith's version works for SequenceableCollections, and uses a
>>>>> sequence to split a sequence.
>>>>>
>>>>> Ours is more tailored towards Strings, and uses a regex to split a
>>>>> String.
>>>>>
>>>>> Perhaps we can consider a merge in which sequences can be split
>>>>> using sequences, and Strings can additionally be split using
>>>>> regexes.
>>>>>
>>>>> We should also take efficiency into account. I did not run any
>>>>> benchmarks yet to compare the implementations.
>>>>>
>>>>> Who is interested in merging these two?
>>>>>
>>>>> Cheers,
>>>>> - on
>>>>>
>>>>> On Apr 5, 2009, at 16:25, Oscar Nierstrasz wrote:
>>>>>
>>>>>>
>>>>>> Hi Keith,
>>>>>>
>>>>>> Now I see there are attached files in Mantis. But they all seem
>>>>>> to date from 2006, whereas your latest comments are from Jan 2009.
>>>>>> Are there more recent files from 2009 that I should look at? If
>>>>>> so, where are they?
>>>>>>
>>>>>> What is the best way to proceed? Shall I create a Join project
>>>>>> on SqueakSource, and if it is updated, post the latest version on
>>>>>> Mantis too?
>>>>>>
>>>>>> Cheers,
>>>>>> - on
>>>>>>
>>>>>> On Apr 5, 2009, at 16:08, Keith Hodges wrote:
>>>>>>
>>>>>>> Stéphane Ducasse wrote:
>>>>>>>>> I wrote the split join implementation that is available on
>>>>>>>>> mantis
>>>>>>>>>
>>>>>>>>> http://bugs.squeak.org/view.php?id=4874
>>>>>>>>>
>>>>>>>>> I use it all the time, if you would like to improve on what is
>>>>>>>>> there, please continue to contribute to the mantis page
>>>>>>>>> discussion/tests and code.
>>>>>>>>> That way we will get a polished implementation that
>>>>>>>>> can be added to squeak or to pharo.
>>>>>>>>>
>>>>>>>>> The suggestion to use #species would be fine (I never use
>>>>>>>>> species myself, because I don't understand what it's really
>>>>>>>>> for).
>>>>>>>>>
>>>>>>>>
>>>>>>>> or class
>>>>>>>> the point is that you get back a collection of the same kind as
>>>>>>>> the receiver
>>>>>>>>
>>>>>>>>> When stef says "I have checked the code and it looks nice" he
>>>>>>>>> didn't say which code he checked, so I am confused.
>>>>>>>>>
>>>>>>>>
>>>>>>>> I looked at the latest version in the repository mentioned by
>>>>>>>> oscar
>>>>>>>> rubyshards
>>>>>>>>
>>>>>>>>
>>>>>>> Which appears to me to be the opposite of what Oscar suggested.
>>>>>>> If I read the email, he asked what the status of mantis 4874 was,
>>>>>>> anticipating that it be integrated. He had "gone back" to ruby
>>>>>>> shards in the absence of the integration of 4874.
>>>>>>>
>>>>>>> Keith
>>>>>>>
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> Pharo-project mailing list
>>>>>>> [email protected]
>>>>>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>>>>>>>
>
> Kind regards,
> Oscar Nierstrasz
> ---
> Prof. Dr. O. Nierstrasz -- [email protected]
> Software Composition Group -- http://www.iam.unibe.ch/~scg
> University of Bern -- Tel/Fax +41 31 631.4618/3355
> vcard: http://www.iam.unibe.ch/~oscar/oscarNierstrasz.vcf
