Re: std.algorithm.splitter on a string not always bidirectional

Jon Degenhardt via Digitalmars-d-learn Fri, 22 Jan 2021 09:00:39 -0800

On Friday, 22 January 2021 at 14:14:50 UTC, Steven Schveighoffer wrote:

On 1/22/21 12:55 AM, Jon Degenhardt wrote:
On Friday, 22 January 2021 at 05:51:38 UTC, Jon Degenhardt wrote:
On Thursday, 21 January 2021 at 22:43:37 UTC, Steven Schveighoffer wrote:
auto sp1 = "a|b|c".splitter('|');
writeln(sp1.back); // ok

auto sp2 = "a.b|c".splitter!(v => !isAlphaNum(v));

writeln(sp2.back); // error, not bidirectional

Why? Is it an oversight, or is there a good reason for it?
I believe the reason is two-fold. First, splitter is lazy. Second, the range splitting is defined in the forward direction, not the reverse direction. A bidirectional range is only supported if it is guaranteed that the splits will occur at the same points in the range when run in either direction. That's why the single element delimiter is supported. It's clearly the case for the predicate function in your example. If that's known to be always true then perhaps it would make sense to enhance splitter to generate bidirectional results in this case.
Note that the predicate might use a random number generator to pick the split points. Even for the same sequence of random numbers, the split points would be different if run from the front than if run from the back.
I think this isn't a good explanation.
All forms of splitter accept a predicate (including the one which supports a bi-directional result). Many other phobos algorithms that accept a predicate provide bidirectional support. The splitter result is also a forward range (which makes no sense in the context of random splits).
Finally, I'd suggest that even if you split based on a subrange that is also bidirectional, it doesn't make sense that you couldn't split backwards based on that. Common sense says a range split on substrings is the same whether you split it forwards or backwards.
I can do this too (and in fact I will, because it works, even though it's horrifically ugly):
auto sp3 = "a.b|c".splitter!((c, unused) => !isAlphaNum(c))('?');
writeln(sp3.back); // ok
Looking at the code, it looks like the first form of splitter uses a different result struct than the other two (which have a common implementation). It just needs cleanup.
-Steve

I think the idea is that if a construct like 'xyz.splitter(args)' produces a range with the sequence of elements {"a", "bc", "def"}, then 'xyz.splitter(args).back' should produce "def". But, if finding the split points starting from the back results in something like {"f", "de", "abc"} then that relationship hasn't held, and the results are unexpected.

Note that in the above example, 'xyz.retro.splitter(args)' might produce {"f", "ed", "cba"}, so again not the same.
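
A concrete case where the scan direction changes the split points, sketched with a multi-element separator rather than the predicate from the thread. The names `reversed`, `splitForward` and `splitBackward` are made up for this illustration. The backward scan is only simulated (reverse the input, split, undo the reversal); it is not something Phobos provides.

```d
import std.algorithm : map, splitter;
import std.array : array;
import std.conv : to;
import std.range : retro;
import std.stdio : writeln;

// Hypothetical helper: return a reversed copy of a string.
string reversed(string t)
{
    return t.retro.array.to!string;
}

// Hypothetical helper: ordinary front-to-back splitting.
string[] splitForward(string s, string sep)
{
    return s.splitter(sep).array;
}

// Hypothetical helper: simulate searching for split points from the back.
string[] splitBackward(string s, string sep)
{
    return reversed(s).splitter(reversed(sep))
        .map!(piece => reversed(piece))
        .array   // pieces in back-to-front order
        .retro   // restore front-to-back order
        .array;
}

void main()
{
    writeln(splitForward("xaaay", "aa"));
    writeln(splitBackward("xaaay", "aa"));
}
```

For "xaaay" split on "aa", the forward scan gives ["x", "ay"] while the simulated backward scan gives ["xa", "y"], so the last element depends on which end the search starts from. For an input like "a|b|c" split on "|", both directions agree.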

Another way to look at it: If split (eager) took a predicate, then 'xyz.splitter(args).back' and 'xyz.split(args).back' should produce the same result. But they will not with the example given.
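
For the single-element delimiter form, where both the lazy and the eager version are available, that relationship can be checked directly. This is a minimal sketch; `lazyBack` and `eagerBack` are hypothetical helper names, not part of Phobos.

```d
import std.algorithm : splitter;
import std.array : split;
import std.range : back, isBidirectionalRange;
import std.stdio : writeln;

// Hypothetical helper: last element taken from the lazy splitter.
string lazyBack(string s, char sep)
{
    auto r = s.splitter(sep);
    // The single-element delimiter form yields a bidirectional range.
    static assert(isBidirectionalRange!(typeof(r)));
    return r.back;
}

// Hypothetical helper: last element taken from the eager split.
string eagerBack(string s, char sep)
{
    return s.split(sep).back;
}

void main()
{
    writeln(lazyBack("a|b|c", '|'));
    writeln(eagerBack("a|b|c", '|'));
}
```

Both calls return the same last element here, which is the consistency a bidirectional splitter result is expected to preserve.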

I believe these consistency issues are the reason why the bidirectional support is limited.

Note: I didn't design any of this, but I did redo the examples in the documentation at one point, which is why I looked at this.


--Jon
