Raphael Matthias Krug wrote:
Sasha

P.S. I have a theory that the habit of printing computer documentation is a roadblock to becoming a "guru". At least, I have not yet encountered a "guru" who printed much, while struggling users seem to print a lot. You cannot be 100% sure about the cause-and-effect relationship, but trying to go printless might activate something that speeds up skill acquisition.


I just printed the soundex parts. That was ten lines :-). To understand my problem, see the text below.
Shawn & Sasha
I am working with medieval sources, so-called tax books. They contain names, tax amounts and other administrative entries. For my research I took nine of these tax books. One of my aims is to find out whether many taxpayers died, moved away or simply stayed, e.g. in times of disease. For this purpose, I inserted every tax book into one table.
To compare the persons in these books, a friend created a PHP script which takes the names from one book and compares them with the other books, right now using a normal select statement. The result shows a name on the left, followed by a table with a row for each tax book, containing a 1 if the name appears in that book.

Ralph:

I believe it is possible to get the results you want using an SQL query, but you would need to organize your data in a different way. You will need to write a script either in PHP or in some other language that will parse out your files and index the soundexes (or some other phonetic encodings) of the names. You will need a structure that looks something like this:

create table soundex_idx(
  col_soundex char(4) not null,
  doc_id int not null,
  num_instances int not null,
  unique key(col_soundex, doc_id)
);

create table name_soundex(
  col_soundex char(4) not null,
  name varchar(30) not null,
  key(col_soundex)
);
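
The indexing script described above could look roughly like the sketch below. It is written in Python rather than PHP, and it uses an in-memory SQLite database in place of MySQL so that it runs on its own (so `unique key(...)` and `key(...)` become SQLite's `unique(...)` and `create index`). The `soundex` function is a hand-written version of the common American Soundex rules; MySQL's built-in SOUNDEX() may give different codes for some names. The helper `build_index`, the `docs` dictionary and the sample names are made up for illustration.

```python
import sqlite3
from collections import Counter

# Letter groups of the common American Soundex scheme (assumption: this
# variant is acceptable; another phonetic encoding could be swapped in).
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(name):
    """Return a 4-character Soundex code, or "" if the name has no A-Z letters."""
    # Only plain A-Z letters are kept; accented letters are dropped.
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    result = letters[0]
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters with the same code
        code = CODES.get(ch, "")
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so a repeated code counts again
    return (result + "000")[:4]


def build_index(conn, docs):
    """Fill both tables; docs maps a doc_id to the list of names in that book."""
    conn.executescript("""
        create table soundex_idx(col_soundex char(4) not null,
                                 doc_id int not null,
                                 num_instances int not null,
                                 unique(col_soundex, doc_id));
        create table name_soundex(col_soundex char(4) not null,
                                  name varchar(30) not null);
        create index name_soundex_key on name_soundex(col_soundex);
    """)
    seen = set()
    for doc_id, names in docs.items():
        coded = [(soundex(n), n) for n in names if soundex(n)]
        counts = Counter(code for code, _ in coded)
        conn.executemany("insert into soundex_idx values (?, ?, ?)",
                         [(code, doc_id, n) for code, n in counts.items()])
        for pair in coded:
            if pair not in seen:  # store each (code, spelling) pair once
                seen.add(pair)
                conn.execute("insert into name_soundex values (?, ?)", pair)


# Hypothetical sample data: two tax books with differently spelled names.
docs = {1: ["Meyer", "Meier", "Schmidt", "Krug"],
        2: ["Maier", "Schmitt", "Weber"]}
conn = sqlite3.connect(":memory:")
build_index(conn, docs)
for row in conn.execute("select * from soundex_idx order by doc_id, col_soundex"):
    print(row)
```

With real data, `docs` would be filled by whatever code parses the tax book files, and the inserts would go to the MySQL tables instead.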

Then, for example, if you want to see all the names in document 1 that also occur in document 2 (with the soundex-defined equivalence), you can do the following:

select distinct name_soundex.name
from soundex_idx a, soundex_idx b, name_soundex
where a.doc_id = 1
  and b.doc_id = 2
  and a.col_soundex = b.col_soundex
  and name_soundex.col_soundex = a.col_soundex;
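
To see what this query returns, here is a small self-contained sketch in Python. It uses an in-memory SQLite database as a stand-in for MySQL, and the rows are invented sample data with Soundex codes entered by hand (assumed codes: M600 for Meyer/Meier/Maier, S530 for Schmidt/Schmitt, K620 for Krug, W160 for Weber). An `order by` is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Same two tables as above, in SQLite syntax (assumption: SQLite stands in
# for MySQL, so "unique key"/"key" become "unique"/"create index").
conn.executescript("""
    create table soundex_idx(col_soundex char(4) not null,
                             doc_id int not null,
                             num_instances int not null,
                             unique(col_soundex, doc_id));
    create table name_soundex(col_soundex char(4) not null,
                              name varchar(30) not null);
    create index name_soundex_key on name_soundex(col_soundex);
""")

# Hypothetical index contents for two tax books.
conn.executemany("insert into soundex_idx values (?, ?, ?)", [
    ("M600", 1, 2), ("S530", 1, 1), ("K620", 1, 1),
    ("M600", 2, 1), ("S530", 2, 1), ("W160", 2, 1),
])
conn.executemany("insert into name_soundex values (?, ?)", [
    ("M600", "Meyer"), ("M600", "Meier"), ("M600", "Maier"),
    ("S530", "Schmidt"), ("S530", "Schmitt"),
    ("K620", "Krug"), ("W160", "Weber"),
])

# The query from the post, with the two document ids as parameters.
query = """
    select distinct name_soundex.name
    from soundex_idx a, soundex_idx b, name_soundex
    where a.doc_id = ?
      and b.doc_id = ?
      and a.col_soundex = b.col_soundex
      and name_soundex.col_soundex = a.col_soundex
    order by name_soundex.name
"""
shared = [row[0] for row in conn.execute(query, (1, 2))]
print(shared)  # ['Maier', 'Meier', 'Meyer', 'Schmidt', 'Schmitt']
```

Krug and Weber drop out because their codes occur in only one of the two documents. Since name_soundex carries no doc_id, the result lists every stored spelling for a shared code, including spellings that occur in only one of the two books.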

--
Sasha Pachev
Create online surveys at http://www.surveyz.com/

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]


