Bugs item #2816059, was opened at 2009-07-03 02:48
Message generated for change (Tracker Item Submitted) made by skinkie
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=482468&aid=2816059&group_id=56967

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: SQL/Core
Group: SQL CVS Head
Status: Open
Resolution: None
Priority: 5
Private: Yes
Submitted By: Stefan de Konink (skinkie)
Assigned to: Niels Nes (nielsnes)
Summary: Performance slowdown on or, direct information loss union

Initial Comment:
I received a take down notice of the KVK, until that is resolved I have to keep 
this academic...

With a new data provider aka real partner without legal battles; new bugs arise:

CREATE TABLE "sys"."faillissementen" (
        "kvk" bigint,
        "bedrijfsnaam" varchar(255),
        "plaats" varchar(32),
        "rechtbank" varchar(64),
        "rolnummer" varchar(16),
        "status" varchar(16),
        "datum" date
);

CREATE TABLE "sys"."kvk" (
        "kvk" bigint,
        "bedrijfsnaam" varchar(255),
        "adres" varchar(64),
        "postcode" varchar(6),
        "plaats" varchar(32),
        "type" varchar(16)
);

select * from faillissementen, kvk where faillissementen.kvk = kvk.kvk;
0 tuples
Timer      25.549 msec 0 rows

sql>select * from faillissementen, kvk where (faillissementen.kvk*10000) = 
kvk.kvk;
22 tuples
Timer      44.483 msec 22 rows

So far fine; now... for some reason nobody really cares to normalise their 
numbers so I want to join the two.
select * from faillissementen, kvk where (faillissementen.kvk*10000) = kvk.kvk 
or faillissementen.kvk = kvk.kvk;

Query gets totally out of bounds.

Now you may ask; so what if you do it like first year students?
select * from faillissementen, kvk where (faillissementen.kvk*10000) = kvk.kvk 
union select * from faillissementen, kvk where faillissementen.kvk = kvk.kvk;
20 tuples
Timer      57.804 msec 20 rows

Now you see that there are 20 results instead of 22. Basically what misses are 
the multiple results of the kvk side matching the number; The most trivial 
example:

sql>select kvk, adres from kvk where kvk = 10926910000;
+-------------+-------------------------+
| kvk         | adres                   |
+=============+=========================+
| 10926910000 | Anthonie Fokkerstraat 5 |
| 10926910000 | Anthonie Fokkerstraat 5 |
+-------------+-------------------------+

sql>select kvk.kvk, kvk.adres from faillissementen, kvk where 
(faillissementen.kvk*10000) = kvk.kvk union select kvk.kvk, kvk.adres from 
faillissementen, kvk where faillissementen.kvk = kvk.kvk;
+--------------+---------------------------+
| kvk          | adres                     |
+==============+===========================+
|  10926910000 | Anthonie Fokkerstraat 5   |
|  60680140000 | Spanjaardslaan 25 -1      |

sql>select kvk.kvk, kvk.adres from faillissementen, kvk where 
(faillissementen.kvk*10000) = kvk.kvk;
+--------------+---------------------------+
| kvk          | adres                     |
+==============+===========================+
|  10926910000 | Anthonie Fokkerstraat 5   |
|  10926910000 | Anthonie Fokkerstraat 5   |


----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=482468&aid=2816059&group_id=56967

------------------------------------------------------------------------------
_______________________________________________
Monetdb-bugs mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/monetdb-bugs

Reply via email to