All - in the discussion about possibly broadening the equality test for 
curly-infix operators, I looked at Scheme R7RS and ended up with some concerns 
about their mandates for cycle detection.  These issues might affect us, too.

Below is a draft email I'm thinking of sending to scheme-reports at 
scheme-reports.org (subject line "Cycle detection problems: #442, #338, 
"equal?").  Any comments?  In particular, am I missing something?

--- David A. Wheeler

====================================

I think the latest draft 6 ballot resolutions 
(http://trac.sacrideo.us/wg/wiki/WG1Ballot6Results) on circular data structures 
still have problems.

First, trac #338 has the display procedure "generate loops like write", but the 
later resolution in #442 means that the resolution of #338 started with a false 
premise.  After all, the write procedure no longer generates "loops like 
write", as was assumed by #338 vote.  So #338 needs revisiting.

Second, trac #442's resolution imposes a large overhead that is completely 
unnecessary to meet the stated goal (termination).  The comments in #442 make 
it clear that for ordinary "write", the primary thing some people cared about 
was termination in the presence of cycles.

But if that were the goal, I would have expected the specification to permit 
far more efficient implementations (of "equal?", "write", and "display").  For 
example, I would have expected the spec to permit an approach I'll call the 
"suspicious counter" implementation:
* Run equal?/write/display for a while, and recurse without tracking cycles. 
Instead, have a running counter of the number of cons/vectors that have been 
iterated so far.  The running counter can be handled with tail recursion, so 
this counting would be very efficient.
* If the counter exceeds some "suspicious" value, THEN start tracking to see if 
we have a cycle (this tracking might be done with a hashtable).

This would still impose a significant overhead, because ANY cycle-detection 
system imposes a massive overhead, but on average its overhead would be FAR 
smaller than the one currently mandated by #442 and #338 in most cases.

However, the resolution of #442 (write+simple+shared) seems to forbid a 
"suspicious counter" implementation.  The current resolution of #442 still 
requires labels when there's a cycle, even though it seems unlikely that the 
user actually wants (or cares) about labels. If the user wanted labels, the 
user would have called write-shared.  I recommend that write-shared be used 
when you actually want labels, and if the purpose is just to ensure termination 
of write and display with cycles, then just require that they terminate in the 
presence of cycles and say *nothing* *else*.  In particular, don't require 
labelling; the spec should allow write to print some "extra" data as long as 
termination detection occurs *eventually*.  This would make "write" and 
"display" more consistent with "equal?", since "equal?" already only requires 
termination in the presence of cycles.

In addition, there seems to be a major gap in the definition of "equal?".  It 
currently says, "Even if its arguments are circular data structures, equal? 
must always terminate."  That additional sentence adds a MAJOR new performance 
and implementation cost on "equal?" compared to R5RS.  But what, exactly, is 
that procedure supposed to do if a cycle is detected?  Return #f?  Throw 
something?  Either?  This additional requirement imposes a major new 
performance cost, but the spec doesn't even promise to do something if it 
happens.  That's a costly effort to produce an undefined result.  (My 
preference would be #f, if "equal?" *has* to have cycle detection, but that's 
not my main point.)

I think redefining these standard functions, which *explicitly* did not 
guarantee termination in the presence of cycles in R5RS and before, is a bad 
idea.  I understand the rationale.  But cycles are exceedingly rare, and 
detecting them has a significant overhead.  You're setting up a situation where 
NON-compliant Scheme will be demonstrably faster than compliant ones in the 
normal case, because there is no standard way to request the traditional 
semantics (WITHOUT cycle detection).  In particular, cycle detection would 
typically be done by a hash table (not in the standard!) or via lists 
(sloooow), both of which impose creating and growing data structures, etc.  Why 
impose such a big overhead on "normal" procedures for an extremely unusual 
case, especially since Scheme for decades has *specifically* made clear that 
users couldn't count on cycle detection in these procedures?  You don't need to 
be Neo to use traditional write and equal?.  Scheme already has many ways to 
get into infinite loops; this is an expensive way to prevent an odd case while 
not really providing additional safety.

Don't get me wrong, I think cycle detection is valuable. But these 
cycle-detection procedures have fundamentally different same space and time 
properties from the R5RS procedures.  New procedures, with different semantics 
and impacts, should have new names!  Those new procedures should have the 
semantics above, where they ONLY are guaranteed to terminate in the presence of 
cycles and where the result of "equal?" in cycles is actually defined.

In any case, even if you change equal?/write/display to do cycle detection, 
compared to R5RS, I think it is *critical* that the spec merely guarantees that 
write/display will be eventual termination in the presence of cycles.  Please 
do NOT try to print labels and perform other complex tasks when the user didn't 
request them.  And similarly, what exactly equal? will do if given a cycle 
needs to be clear, or users can't depend on it.

In general, I like how R7RS is shaping up; this seems to be a sore spot in an 
otherwise really promising spec.  I especially can't wait for a library and 
exception system that might actually *PORT* across Schemes.

Please re-examine #338, #442, and "equal?" in this light.  Thanks.



--- David A. Wheeler

------------------------------------------------------------------------------
Live Security Virtual Conference
Exclusive live event will cover all the ways today's security and 
threat landscape has changed and how IT managers can respond. Discussions 
will include endpoint security, mobile security and the latest in malware 
threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
_______________________________________________
Readable-discuss mailing list
Readable-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/readable-discuss

Reply via email to