On Sep 30, 12:49 pm, Michael Moore <michaeljmo...@gmail.com> wrote:
> Here is a better solution created by my friend Shaun Batterton.
>
> SQL> with one as ( select 'This is it' x from dual),
>
>   2       two as ( select 'ThiS is not' x from dual)
>
>   3  select sum(decode(substr(one.x, level, 1) , substr(two.x, level, 1), 0,
> 1)) cnt_diff
>
>   4  from one,two
>
>   5  connect by level<= greatest(length(one.x), length(two.x));
>

It's wrong:

SQL> with one as ( select 'This is it' x from dual),
  2       two as ( select 'ThiS is not' x from dual)
  3  select sum(decode(substr(one.x, level, 1) , substr(two.x, level,
1), 0, 1)) cnt_diff
  4  from one,two
  5  connect by level<= greatest(length(one.x), length(two.x));

  CNT_DIFF
----------
         4

I count 3 differences between those two strings, not 4 (the capital S
and the letters 'no').  The fourth 'mismatch' is due to how this plays
out:

SQL> with one as ( select 'This is it' x from dual),
  2       two as ( select 'ThiS is not' x from dual)
  3  select substr(one.x, level, 1) , substr(two.x, level, 1)
  4  from one,two
  5  connect by level<= greatest(length(one.x), length(two.x));

SUBS SUBS
---- ----
T    T
h    h
i    i
s    S           <=== correctly identified difference

i    i
s    s

i    n           <=== correctly identified difference
t    o           <=== incorrectly identified differences as 'no'
differs from 'i' but both 't' characters should match
     t           <=== as they are both at the end of a line

11 rows selected.

The calculation of differences should stop at the end of the line and
this method extends beyond the end of the shorter line in favor of the
length of the longer line. The shorter line length should 'prevail' in
my opinion to produce the correct result:

SQL> with one as ( select 'This is it' x from dual),
  2       two as ( select 'ThiS is not' x from dual)
  3  select sum(decode(substr(one.x, level, 1) , substr(two.x, level,
1), 0, 1)) cnt_diff
  4  from one,two
  5  connect by level<= least(length(one.x), length(two.x));

  CNT_DIFF
----------
         3

But it's a slippery slope, indeed, when one takes shortcuts:

SQL> with one as ( select 'This ain''t it' x from dual),
  2       two as ( select 'ThiS ain''t it neither' x from dual)
  3  select sum(decode(substr(one.x, level, 1) , substr(two.x, level,
1), 0, 1)) cnt_diff
  4  from one,two
  5  connect by level<= least(length(one.x), length(two.x));

  CNT_DIFF
----------
         1

This is wrong for obvious reasons.  The original logic works for this
set of strings:

SQL> with one as ( select 'This ain''t it' x from dual),
  2       two as ( select 'ThiS ain''t it neither' x from dual)
  3  select sum(decode(substr(one.x, level, 1) , substr(two.x, level,
1), 0, 1)) cnt_diff
  4  from one,two
  5  connect by level<= greatest(length(one.x), length(two.x));

  CNT_DIFF
----------
         9

However  it doesn't work for all strings.  I believee it needs to be a
bit more involved to get it to work for most (if not all) strings:

SQL> declare
  2     one varchar2(40):='This is it';
  3     two varchar2(40):='ThiS is not';
  4
  5     currpos number:=1;
  6
  7     chars_diff varchar2(40);
  8
  9     num_diff number;
 10
 11     l_one number:= length(one);
 12     l_two number:= length(two);
 13
 14  begin
 15     if l_two - l_one > 0 then
 16             num_diff := l_two - l_one;
 17             while substr(one, currpos, 1) is not null loop
 18             if substr(one, currpos,1) <> substr(two, currpos, 1)
then
 19                     chars_diff := chars_diff||substr(two, currpos,
1);
 20                     num_diff := num_diff+1;
 21                     currpos := currpos+1;
 22             else
 23                     currpos := currpos+1;
 24             end if;
 25             end loop;
 26     else
 27             num_diff := l_one - l_two;
 28             while substr(two, currpos, 1) is not null loop
 29             if substr(two, currpos,1) <> substr(one, currpos, 1)
then
 30                     chars_diff := chars_diff||substr(one, currpos,
1);
 31                     num_diff := num_diff+1;
 32                     currpos := currpos+1;
 33             else
 34                     currpos := currpos+1;
 35             end if;
 36             end loop;
 37     end if;
 38
 39     if l_two > l_one and substr(one, length(one), 1) !=
substr(two, length(two), 1) then
 40             chars_diff := chars_diff||substr(two, l_one+1);
 41     else
 42             chars_diff := chars_diff||substr(two, l_one+1, l_two -
l_one - 1);
 43     end if;
 44     if substr(one, length(one), 1) = substr(two, length(two), 1)
then
 45             num_diff := num_diff - 1;
 46     end if;
 47
 48
 49
 50     dbms_output.put_line('Characters which are different: '||
chars_diff||'   Diff count: '||num_diff);
 51
 52  end;
 53  /
Characters which are different: Sno   Diff count: 3

PL/SQL procedure successfully completed.

SQL> declare
  2     one varchar2(40):='This ain''t it';
  3     two varchar2(40):='ThiS ain''t it neither';
  4
  5     currpos number:=1;
  6
  7     chars_diff varchar2(40);
  8
  9     num_diff number;
 10
 11     l_one number:= length(one);
 12     l_two number:= length(two);
 13
 14  begin
 15     if l_two - l_one > 0 then
 16             num_diff := l_two - l_one;
 17             while substr(one, currpos, 1) is not null loop
 18             if substr(one, currpos,1) <> substr(two, currpos, 1)
then
 19                     chars_diff := chars_diff||substr(two, currpos,
1);
 20                     num_diff := num_diff+1;
 21                     currpos := currpos+1;
 22             else
 23                     currpos := currpos+1;
 24             end if;
 25             end loop;
 26     else
 27             num_diff := l_one - l_two;
 28             while substr(two, currpos, 1) is not null loop
 29             if substr(two, currpos,1) <> substr(one, currpos, 1)
then
 30                     chars_diff := chars_diff||substr(one, currpos,
1);
 31                     num_diff := num_diff+1;
 32                     currpos := currpos+1;
 33             else
 34                     currpos := currpos+1;
 35             end if;
 36             end loop;
 37     end if;
 38
 39     if l_two > l_one and substr(one, length(one), 1) !=
substr(two, length(two), 1) then
 40             chars_diff := chars_diff||substr(two, l_one+1);
 41     else
 42             chars_diff := chars_diff||substr(two, l_one+1, l_two -
l_one - 1);
 43     end if;
 44     if substr(one, length(one), 1) = substr(two, length(two), 1)
then
 45             num_diff := num_diff - 1;
 46     end if;
 47
 48
 49
 50     dbms_output.put_line('Characters which are different: '||
chars_diff||'   Diff count: '||num_diff);
 51
 52  end;
 53  /
Characters which are different: S neither   Diff count: 9

PL/SQL procedure successfully completed.

SQL>


David Fitzjarrell

-- 
You received this message because you are subscribed to the Google
Groups "Oracle PL/SQL" group.
To post to this group, send email to Oracle-PLSQL@googlegroups.com
To unsubscribe from this group, send email to
oracle-plsql-unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/Oracle-PLSQL?hl=en

Reply via email to