Andrea, here are some problems and their solutions:

(1) Where do you get a "tolower" function that takes a char* and returns a 
char*?  The one in ctype.h works a character at a time (thus takes an int and 
returns an int).

(2) Why is the entire sqlite3 engine included in a DLL which is loaded as an 
extension to sqlite3?

Please follow along and I will show you where we can find and fix these issues 
... this might be helpful to other extension writers as well.

First, the -std=c99 needs to be -std=gnu99 to permit the GNU extension 
functions to be recognized.  
Without the GNU extensions a bunch of non-ANSI names are not recognized because 
-std=c99 (like -ansi) defines __STRICT_ANSI__, and the headers hide those names 
when it is defined.  Once this change is made 99% of the errors go away.
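
The difference between the two modes can be checked from C itself, since gcc 
predefines __STRICT_ANSI__ under -std=c99 and -ansi but not under -std=gnu99.  
A minimal sketch (the function name strict_ansi_enabled is made up for this 
illustration):

```c
/* Reports whether this translation unit was compiled in a strict
   ISO mode.  Headers test the same macro to decide whether to
   declare non-ANSI names. */
static int strict_ansi_enabled(void)
{
#ifdef __STRICT_ANSI__
    return 1;   /* e.g. gcc -std=c99 or gcc -ansi */
#else
    return 0;   /* e.g. gcc -std=gnu99 */
#endif
}
```

Compiling this once with each -std value and printing the result shows which 
mode a build is really using.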

src\wrapper_functions.c: In function 'stringmetricsFunc':
src\wrapper_functions.c:350:16: warning: 'return' with a value, in function 
returning void [enabled by default]
                return (1);
                ^

This is easy.  The callback that implements an SQLite scalar function (the one 
you register with sqlite3_create_function) is declared as returning void, so it 
cannot hand back a status code.  It reports a failure by calling 
sqlite3_result_error on the context and then returning.  So keep the function 
definition as void, and change "return (1);" into a call to 
sqlite3_result_error followed by a plain "return;".


src\wrapper_functions.c:353:4: warning: implicit declaration of function 
'tolower' [-Wimplicit-function-declaration]
    if(strcmp(tolower(kindofoutput),"similarity")==0) {
    ^
src\wrapper_functions.c:353:4: warning: passing argument 1 of 'strcmp' makes 
pointer from integer without a cast [enabled by default]
In file included from src\wrapper_functions.c:57:0:
c:\apps\mingw\include\string.h:55:37: note: expected 'const char *' but 
argument is of type 'int'
 _CRTIMP int __cdecl __MINGW_NOTHROW strcmp (const char*, const char*)  
__MINGW_ATTRIB_PURE;
                                     ^
src\wrapper_functions.c:355:4: warning: passing argument 1 of 'strcmp' makes 
pointer from integer without a cast [enabled by default]
    } else if(strcmp(tolower(kindofoutput),"metric")==0) {
    ^
In file included from src\wrapper_functions.c:57:0:
c:\apps\mingw\include\string.h:55:37: note: expected 'const char *' but 
argument is of type 'int'
 _CRTIMP int __cdecl __MINGW_NOTHROW strcmp (const char*, const char*)  
__MINGW_ATTRIB_PURE;


The "tolower" function works on an int (single character) and returns an int 
(single character).  It does not work on whole strings.  The function for doing 
a case-insensitive string compare is "stricmp" (a non-ANSI name supplied by the 
Microsoft runtime; the POSIX equivalent is "strcasecmp").
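
Where neither of those library functions is available, the same comparison can 
be built from the per-character tolower.  This is only a sketch: ci_strcmp is a 
hypothetical helper, not something in libstringmetrics, and it handles only 
single-byte characters in the current C locale.

```c
#include <ctype.h>

/* Hypothetical helper: compare two NUL-terminated strings, ignoring
   case.  Returns <0, 0 or >0 in the manner of strcmp(). */
static int ci_strcmp(const char *a, const char *b)
{
    for (;;) {
        /* tolower() takes and returns an int holding ONE character.
           The cast to unsigned char avoids undefined behaviour when
           plain char is signed and the value is negative. */
        int ca = tolower((unsigned char)*a);
        int cb = tolower((unsigned char)*b);

        /* Stop at the first difference or at the end of both strings. */
        if (ca != cb || ca == '\0')
            return ca - cb;
        a++;
        b++;
    }
}
```

With this, ci_strcmp(kindofoutput, "similarity") == 0 accepts "Similarity" and 
"SIMILARITY" as well.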

This can be fixed by making the following changes in wrapper_functions.c:

        if(kindofoutput!=NULL) {
            if(stricmp(kindofoutput,"similarity")==0) {
                sqlite3_result_double(context, similarity);
            } else if(stricmp(kindofoutput,"metric")==0) {
                sqlite3_result_text(context, metrics, strlen(metrics)+1, NULL);
            } else {
                mex = malloc(strlen(sm_name) + 200 + strlen(metrics)+1);
                sprintf(mex,"%s between \"%s\" & \"%s\" is \"%s\" and yields a "
                            "%3.0f%% similarity",
                        sm_name,par1,par2,metrics,similarity*100);
                sqlite3_result_text(context, mex, strlen(mex)+1, NULL);
            }
        } else {
            mex = malloc(strlen(sm_name) + 200 + strlen(metrics)+1);
            sprintf(mex,"%s between \"%s\" & \"%s\" is \"%s\" and yields a "
                        "%3.0f%% similarity",
                    sm_name,par1,par2,metrics,similarity*100);
            sqlite3_result_text(context, mex, strlen(mex)+1, NULL);
        }

(basically a global search and replace of "strcmp(tolower(kindofoutput)," with 
"stricmp(kindofoutput,")
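
One thing worth noting about the message buffer in that snippet: its size 
allows for sm_name and metrics plus 200 spare bytes, but not for the lengths of 
par1 and par2, so long input strings could overrun it.  A sketch of sizing the 
buffer from the formatted length instead follows.  It assumes a C99-conforming 
snprintf (the older Microsoft _snprintf reports truncation differently), and 
format_report is a hypothetical helper name.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: build the report text in a buffer of exactly
   the right size.  Returns a malloc'ed string the caller must free,
   or NULL on failure. */
static char *format_report(const char *sm_name, const char *par1,
                           const char *par2, const char *metrics,
                           double similarity)
{
    static const char fmt[] =
        "%s between \"%s\" & \"%s\" is \"%s\" and yields a "
        "%3.0f%% similarity";

    /* First pass: a NULL buffer of size 0 only measures the output. */
    int need = snprintf(NULL, 0, fmt, sm_name, par1, par2, metrics,
                        similarity * 100);
    if (need < 0)
        return NULL;

    char *mex = malloc((size_t)need + 1);   /* +1 for the terminator */
    if (mex == NULL)
        return NULL;

    /* Second pass: write the text for real. */
    snprintf(mex, (size_t)need + 1, fmt, sm_name, par1, par2, metrics,
             similarity * 100);
    return mex;
}
```

A string built this way could be handed over with 
sqlite3_result_text(context, mex, -1, free), which lets SQLite measure the text 
itself and release the buffer when it has finished with it.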


Now we have left only the problem that the entirety of SQLite3 itself is 
compiled into the extension.

Since we are not compiling the extension into the core, you simply need to use 
the correct header.  "wrapper_functions.c" should be using sqlite3ext.h, not 
sqlite3.h.  You then need to add a macro to get a reference to the sqlite3_api 
thus:


#include <sqlite3ext.h>
#include <string.h>
#include <stdlib.h>
#include <malloc.h>
#include <stddef.h>
#include "simmetrics.h"

SQLITE_EXTENSION_INIT3

    const int SIMMETC = 27;


SQLITE_EXTENSION_INIT1 defines the "sqlite3_api" pointer.  
SQLITE_EXTENSION_INIT2 initializes its value.  If you need to access the 
"sqlite3_api" in a source file which is linked with another file that already 
defines and initializes sqlite3_api, then you just put the 
SQLITE_EXTENSION_INIT3 macro at the top of that module.  (The definitions are 
at the end of sqlite3ext.h.)
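
The idea behind the three macros is a shared pointer to a table of function 
pointers supplied by the host program.  The following is a toy model of that 
pattern and NOT the real sqlite3ext.h: every toy_* and TOY_* name is invented 
for illustration, and the real table (sqlite3_api_routines) has far more 
members.

```c
#include <stddef.h>
#include <string.h>

/* Toy stand-in for sqlite3_api_routines: a table of host functions. */
typedef struct toy_api_routines {
    size_t (*text_length)(const char *);
} toy_api_routines;

/* Toy analogues of the three macros (hypothetical names). */
#define TOY_EXTENSION_INIT1     const toy_api_routines *toy_api = 0;
#define TOY_EXTENSION_INIT2(v)  toy_api = (v);
#define TOY_EXTENSION_INIT3     extern const toy_api_routines *toy_api;

/* Secondary source files only declare the pointer ... */
TOY_EXTENSION_INIT3
/* ... while exactly one file in the extension defines it. */
TOY_EXTENSION_INIT1

/* The "host" side: the table handed to the extension at load time. */
static const toy_api_routines host_routines = { strlen };

/* Entry-point analogue: remember the table the host passed in. */
int toy_extension_init(const toy_api_routines *pApi)
{
    TOY_EXTENSION_INIT2(pApi)
    return 0;
}

/* Extension code reaches host functions only through the pointer,
   so the host's implementation is never linked into the extension. */
size_t toy_ext_length(const char *s)
{
    return toy_api->text_length(s);
}
```

This indirection is why an extension built against sqlite3ext.h does not need 
its own copy of the engine.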

You then change the compile command thusly:

 gcc -s -O3 -std=gnu99 -mdll -mthreads -Bl,--static -static-libgcc 
     -I src 
     -I src\libsimmetrics\include 
     -I ..\sqlite\dist 
     src\*.c 
     src\libsimmetrics\simmetrics\*.c 
     -o stringmetrics.dll

where "..\sqlite\dist" is the location of the sqlite3 header files (I point 
them to my own SQLite3 build directories; you can carry an extra copy in the 
src/sqlite3 directory and refer to those if you prefer).  This produces a 73K 
extension module with no external dependencies (other than the MSVCRT.DLL 
subsystem runtime library) and produces no diagnostic output.

I added the -mthreads because I presume this may be used in a multithread 
environment.  It added no linkage to the thread library code, so I assume the 
base functions used were already thread-safe (or could not be made so).  I 
haven't looked into which is the case.

After these changes we get the following (on Windows 8.1 x64, with the current 
MinGW 32-bit compiler) (slightly reformatted to fit your screen):

2014-09-28 14:05:30 [D:\Source\libstringmetrics-master]
>gcc -s -O3 -std=gnu99 -mdll -mthreads -Bl,--static -static-libgcc 
     -I src 
     -I src\libsimmetrics\include 
     -I ..\sqlite\dist 
     src\*.c 
     src\libsimmetrics\simmetrics\*.c 
     -o stringmetrics.dll

and comparing this extension to the original included in the distribution (I 
stripped it, so it is smaller than the one in the distribution because the 
internal symbol table is gone)

2014-09-28 14:05:33 [D:\Source\libstringmetrics-master]
>dir *.dll

2014-09-28  12:57           769,038 libstringmetrics.dll
2014-09-28  14:05            75,776 stringmetrics.dll

and running it:

2014-09-28 13:55:55 [D:\Source\libstringmetrics-master]
>sqlite
SQLite version 3.8.7 2014-09-26 18:30:11
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
sqlite> .load stringmetrics
sqlite> .echo on
sqlite> .read test.sql
.read test.sql
select load_extension("libstringmetrics.dll");

select stringmetrics("block_distance_custom","phrase","via 
giuseppe-garibaldi,25", "via giuseppe garibaldi 25",",-");
Block Distance customized between "via giuseppe-garibaldi,25" & "via giuseppe 
garibaldi 25" is "0" and yields a 100% similarity
select stringmetrics("cosine_custom","phrase","via giuseppe-garibaldi,25", "via 
giuseppe garibaldi 25",",-");
Cosine Similarity customized between "via giuseppe-garibaldi,25" & "via 
giuseppe garibaldi 25" is "1.000000" and yields a 100% similarity
select stringmetrics("dice_custom","phrase","via giuseppe-garibaldi,25", "via 
giuseppe garibaldi 25",",-");
Dice Similarity customized between "via giuseppe-garibaldi,25" & "via giuseppe 
garibaldi 25" is "1.000000" and yields a 100% similarity
select stringmetrics("euclidean_distance","phrase","via giuseppe-garibaldi,25", 
"via giuseppe garibaldi 25",",-");
Euclidean Distance between "via giuseppe-garibaldi,25" & "via giuseppe 
garibaldi 25" is "2.00" and yields a  55% similarity
select stringmetrics("euclidean_distance_custom","phrase","via 
giuseppe-garibaldi,25", "via giuseppe garibaldi 25",",-");
Euclidean Distance customized between "via giuseppe-garibaldi,25" & "via 
giuseppe garibaldi 25" is "0" and yields a 100% similarity
select stringmetrics("jaccard","phrase","via giuseppe-garibaldi,25", "via 
giuseppe garibaldi 25",",-");
Jaccard Similarity between "via giuseppe-garibaldi,25" & "via giuseppe 
garibaldi 25" is "0.200000" and yields a  20% similarity
select stringmetrics("jaccard_custom","phrase","via giuseppe-garibaldi,25", 
"via giuseppe garibaldi 25",",-");
Jaccard Similarity customized between "via giuseppe-garibaldi,25" & "via 
giuseppe garibaldi 25" is "1.000000" and yields a 100% similarity
select stringmetrics("jaro","phrase","via giuseppe-garibaldi,25", "via giuseppe 
garibaldi 25",",-");
Jaro Similarity between "via giuseppe-garibaldi,25" & "via giuseppe garibaldi 
25" is "0.920000" and yields a  92% similarity
select stringmetrics("jaro_winkler","phrase","via giuseppe-garibaldi,25", "via 
giuseppe garibaldi 25",",-");
Jaro Winkler Similarity between "via giuseppe-garibaldi,25" & "via giuseppe 
garibaldi 25" is "0.968000" and yields a  97% similarity
select stringmetrics("levenshtein","phrase","via giuseppe-garibaldi,25", "via 
giuseppe garibaldi 25",",-");
Levenshtein Distance between "via giuseppe-garibaldi,25" & "via giuseppe 
garibaldi 25" is "2" and yields a  92% similarity
select stringmetrics("matching_coefficient","phrase","via 
giuseppe-garibaldi,25", "via giuseppe garibaldi 25",",-");
Matching Coefficient SimMetrics between "via giuseppe-garibaldi,25" & "via 
giuseppe garibaldi 25" is "1.00" and yields a  25% similarity
select stringmetrics("matching_coefficient_custom","phrase","via 
giuseppe-garibaldi,25", "via giuseppe garibaldi 25",",-");
Matching Coefficient SimMetrics customized between "via giuseppe-garibaldi,25" 
& "via giuseppe garibaldi 25" is "4.00" and yields a 100% similarity
select stringmetrics("monge_elkan","phrase","via giuseppe-garibaldi,25", "via 
giuseppe garibaldi 25",",-");
Monge Elkan Similarity between "via giuseppe-garibaldi,25" & "via giuseppe 
garibaldi 25" is "1.012500" and yields a 101% similarity
select stringmetrics("monge_elkan_custom","phrase","via giuseppe-garibaldi,25", 
"via giuseppe garibaldi 25",",-");
Matching Coefficient SimMetrics customized STILL NOT IMPLEMENTED between "via 
giuseppe-garibaldi,25" & "via giuseppe garibaldi 25" is "still not 
implemented" and yields a   0% similarity
select stringmetrics("needleman_wunch","phrase","via giuseppe-garibaldi,25", 
"via giuseppe garibaldi 25",",-");
Needleman Wunch SimMetrics between "via giuseppe-garibaldi,25" & "via giuseppe 
garibaldi 25" is "2.00" and yields a  96% similarity
select stringmetrics("overlap_coefficient","phrase","via 
giuseppe-garibaldi,25", "via giuseppe garibaldi 25",",-");
Overlap Coefficient Similarity between "via giuseppe-garibaldi,25" & "via 
giuseppe garibaldi 25" is "0.500000" and yields a  50% similarity
select stringmetrics("overlap_coefficient_custom","phrase","via 
giuseppe-garibaldi,25", "via giuseppe garibaldi 25",",-");
Overlap Coefficient Similarity customized between "via giuseppe-garibaldi,25" & 
"via giuseppe garibaldi 25" is "1.000000" and yields a 100% similarity
select stringmetrics("qgrams_distance","phrase","via giuseppe-garibaldi,25", 
"via giuseppe garibaldi 25",",-");
QGrams Distance between "via giuseppe-garibaldi,25" & "via giuseppe garibaldi 
25" is "12" and yields a  78% similarity
select stringmetrics("qgrams_distance_custom","phrase","via 
giuseppe-garibaldi,25", "via giuseppe garibaldi 25",",-");
QGrams Distance customized between "via giuseppe-garibaldi,25" & "via giuseppe 
garibaldi 25" is "0" and yields a 100% similarity
select stringmetrics("smith_waterman","phrase","via giuseppe-garibaldi,25", 
"via giuseppe garibaldi 25",",-");
Smith Waterman SimMetrics between "via giuseppe-garibaldi,25" & "via giuseppe 
garibaldi 25" is "21.00" and yields a  84% similarity
select stringmetrics("smith_waterman_gotoh","phrase","via 
giuseppe-garibaldi,25", "via giuseppe garibaldi 25",",-");
Smith Waterman Gotoh SimMetrics between "via giuseppe-garibaldi,25" & "via 
giuseppe garibaldi 25" is "109.00" and yields a  87% similarity
select stringmetrics("soundex_phonetics","phrase","via giuseppe-garibaldi,25", 
"via giuseppe garibaldi 25",",-");
Soundex Phonetics between "via giuseppe-garibaldi,25" & "via giuseppe garibaldi 
25" is "V221 & V221" and yields a 100% similarity
select stringmetrics("metaphone_phonetics","phrase","via 
giuseppe-garibaldi,25", "via giuseppe garibaldi 25",",-");
Metaphone Phonetics between "via giuseppe-garibaldi,25" & "via giuseppe 
garibaldi 25" is "FJSP & FJSP" and yields a 100% similarity
select stringmetrics("double_metaphone_phonetics","phrase","via 
giuseppe-garibaldi,25", "via giuseppe garibaldi 25",",-");
Double Metaphone Phonetics between "via giuseppe-garibaldi,25" & "via giuseppe 
garibaldi 25" is "FJSP & FJSP" and yields a 100% similarity

sqlite>



>-----Original Message-----
>From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-
>boun...@sqlite.org] On Behalf Of Andrea Peri
>Sent: Sunday, 28 September, 2014 02:53
>To: Gert Van Assche; General Discussion of SQLite Database
>Subject: Re: [sqlite] A new extension for sqlite to analyze the
>stringmetrics
>
>You should use SQLite 32bit
>On 28 Sep 2014 10:45, "Gert Van Assche" <ger...@gmail.com> wrote:
>
>> Thanks Andrea.
>> When I download the DLL I get exactly the same error.
>> I'm using the 32bit SQLite3.exe on a Win 64 bit machine.
>> Could that cause the error?
>>
>> thanks
>>
>> gert
>>
>> 2014-09-27 20:27 GMT+02:00 Andrea Peri <aperi2...@gmail.com>:
>>
>>> https://github.com/aperi2007/libstringmetrics
>>>
>>>
>>> >Andrea, where do I find it?
>>> >
>>> >thanks
>>> >
>>> >gert
>>>
>>>
>>>
>>> --
>>> -----------------
>>> Andrea Peri
>>> . . . . . . . . .
>>> qwerty àèìòù
>>> -----------------
>>>
>>
>>
>_______________________________________________
>sqlite-users mailing list
>sqlite-users@sqlite.org
>http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


