On Sep 30, 12:49 pm, Michael Moore <michaeljmo...@gmail.com> wrote: > Here is a better solution created by my friend Shaun Batterton. > > SQL> with one as ( select 'This is it' x from dual), > > 2 two as ( select 'ThiS is not' x from dual) > > 3 select sum(decode(substr(one.x, level, 1) , substr(two.x, level, 1), 0, > 1)) cnt_diff > > 4 from one,two > > 5 connect by level<= greatest(length(one.x), length(two.x)); >
It's wrong: SQL> with one as ( select 'This is it' x from dual), 2 two as ( select 'ThiS is not' x from dual) 3 select sum(decode(substr(one.x, level, 1) , substr(two.x, level, 1), 0, 1)) cnt_diff 4 from one,two 5 connect by level<= greatest(length(one.x), length(two.x)); CNT_DIFF ---------- 4 I count 3 differences between those two strings, not 4 (the capital S and the letters 'no'). The fourth 'mismatch' is due to how this plays out: SQL> with one as ( select 'This is it' x from dual), 2 two as ( select 'ThiS is not' x from dual) 3 select substr(one.x, level, 1) , substr(two.x, level, 1) 4 from one,two 5 connect by level<= greatest(length(one.x), length(two.x)); SUBS SUBS ---- ---- T T h h i i s S <=== correctly identified difference i i s s i n <=== correctly identified difference t o <=== incorrectly identified differences as 'no' differs from 'i' but both 't' characters should match t <=== as they are both at the end of a line 11 rows selected. The calculation of differences should stop at the end of the line and this method extends beyond the end of the shorter line in favor of the length of the longer line. The shorter line length should 'prevail' in my opinion to produce the correct result: SQL> with one as ( select 'This is it' x from dual), 2 two as ( select 'ThiS is not' x from dual) 3 select sum(decode(substr(one.x, level, 1) , substr(two.x, level, 1), 0, 1)) cnt_diff 4 from one,two 5 connect by level<= least(length(one.x), length(two.x)); CNT_DIFF ---------- 3 But it's a slippery slope, indeed, when one takes shortcuts: SQL> with one as ( select 'This ain''t it' x from dual), 2 two as ( select 'ThiS ain''t it neither' x from dual) 3 select sum(decode(substr(one.x, level, 1) , substr(two.x, level, 1), 0, 1)) cnt_diff 4 from one,two 5 connect by level<= least(length(one.x), length(two.x)); CNT_DIFF ---------- 1 This is wrong for obvious reasons. The original logic works for this set of strings: SQL> with one as ( select 'This ain''t it' x from dual), 2 two as ( select 'ThiS ain''t it neither' x from dual) 3 select sum(decode(substr(one.x, level, 1) , substr(two.x, level, 1), 0, 1)) cnt_diff 4 from one,two 5 connect by level<= greatest(length(one.x), length(two.x)); CNT_DIFF ---------- 9 However it doesn't work for all strings. I believee it needs to be a bit more involved to get it to work for most (if not all) strings: SQL> declare 2 one varchar2(40):='This is it'; 3 two varchar2(40):='ThiS is not'; 4 5 currpos number:=1; 6 7 chars_diff varchar2(40); 8 9 num_diff number; 10 11 l_one number:= length(one); 12 l_two number:= length(two); 13 14 begin 15 if l_two - l_one > 0 then 16 num_diff := l_two - l_one; 17 while substr(one, currpos, 1) is not null loop 18 if substr(one, currpos,1) <> substr(two, currpos, 1) then 19 chars_diff := chars_diff||substr(two, currpos, 1); 20 num_diff := num_diff+1; 21 currpos := currpos+1; 22 else 23 currpos := currpos+1; 24 end if; 25 end loop; 26 else 27 num_diff := l_one - l_two; 28 while substr(two, currpos, 1) is not null loop 29 if substr(two, currpos,1) <> substr(one, currpos, 1) then 30 chars_diff := chars_diff||substr(one, currpos, 1); 31 num_diff := num_diff+1; 32 currpos := currpos+1; 33 else 34 currpos := currpos+1; 35 end if; 36 end loop; 37 end if; 38 39 if l_two > l_one and substr(one, length(one), 1) != substr(two, length(two), 1) then 40 chars_diff := chars_diff||substr(two, l_one+1); 41 else 42 chars_diff := chars_diff||substr(two, l_one+1, l_two - l_one - 1); 43 end if; 44 if substr(one, length(one), 1) = substr(two, length(two), 1) then 45 num_diff := num_diff - 1; 46 end if; 47 48 49 50 dbms_output.put_line('Characters which are different: '|| chars_diff||' Diff count: '||num_diff); 51 52 end; 53 / Characters which are different: Sno Diff count: 3 PL/SQL procedure successfully completed. SQL> declare 2 one varchar2(40):='This ain''t it'; 3 two varchar2(40):='ThiS ain''t it neither'; 4 5 currpos number:=1; 6 7 chars_diff varchar2(40); 8 9 num_diff number; 10 11 l_one number:= length(one); 12 l_two number:= length(two); 13 14 begin 15 if l_two - l_one > 0 then 16 num_diff := l_two - l_one; 17 while substr(one, currpos, 1) is not null loop 18 if substr(one, currpos,1) <> substr(two, currpos, 1) then 19 chars_diff := chars_diff||substr(two, currpos, 1); 20 num_diff := num_diff+1; 21 currpos := currpos+1; 22 else 23 currpos := currpos+1; 24 end if; 25 end loop; 26 else 27 num_diff := l_one - l_two; 28 while substr(two, currpos, 1) is not null loop 29 if substr(two, currpos,1) <> substr(one, currpos, 1) then 30 chars_diff := chars_diff||substr(one, currpos, 1); 31 num_diff := num_diff+1; 32 currpos := currpos+1; 33 else 34 currpos := currpos+1; 35 end if; 36 end loop; 37 end if; 38 39 if l_two > l_one and substr(one, length(one), 1) != substr(two, length(two), 1) then 40 chars_diff := chars_diff||substr(two, l_one+1); 41 else 42 chars_diff := chars_diff||substr(two, l_one+1, l_two - l_one - 1); 43 end if; 44 if substr(one, length(one), 1) = substr(two, length(two), 1) then 45 num_diff := num_diff - 1; 46 end if; 47 48 49 50 dbms_output.put_line('Characters which are different: '|| chars_diff||' Diff count: '||num_diff); 51 52 end; 53 / Characters which are different: S neither Diff count: 9 PL/SQL procedure successfully completed. SQL> David Fitzjarrell -- You received this message because you are subscribed to the Google Groups "Oracle PL/SQL" group. To post to this group, send email to Oracle-PLSQL@googlegroups.com To unsubscribe from this group, send email to oracle-plsql-unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/Oracle-PLSQL?hl=en