Re: [PATCHES] Configurable Penalty Costs for Levenshtein

2008-04-03 Thread Volkan YAZICI
On Thu, 03 Apr 2008, Tom Lane <[EMAIL PROTECTED]> writes:
> Volkan YAZICI <[EMAIL PROTECTED]> writes:
>> Sorry for the delay, but the reply of Tom didn't reach me. I've modified
>> the patch according to Tom's comments. I hope I am not too late.
>
> Applied after considerable revision.  This patch:
>
> * introduced a memory stomp that was not there before (I strongly
>   recommend testing C code in an --enable-cassert build)
> * added a user-visible feature without documenting it
> * undid a conflicting patch that had been applied since your first version
> * removed a number of useful comments from the code
>
> I cleaned it up and applied anyway, but generally we expect a higher
> quality standard for patches that are claimed to be ready to apply.

Thanks so much for your kindness. Next time, please don't hesitate to
reject the patch by sending me an email listing your concerns, as in the
lines above; I'll happily fix it as best I can and resend it. I don't
want to interrupt your work with such trivial stuff.


Regards.

-- 
Sent via pgsql-patches mailing list (pgsql-patches@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-patches


Re: [PATCHES] Configurable Penalty Costs for Levenshtein

2008-04-03 Thread Tom Lane
Volkan YAZICI <[EMAIL PROTECTED]> writes:
> Sorry for the delay, but the reply of Tom didn't reach me. I've modified
> the patch according to Tom's comments. I hope I am not too late.

Applied after considerable revision.  This patch:

* introduced a memory stomp that was not there before (I strongly
  recommend testing C code in an --enable-cassert build)
* added a user-visible feature without documenting it
* undid a conflicting patch that had been applied since your first version
* removed a number of useful comments from the code

I cleaned it up and applied anyway, but generally we expect a higher
quality standard for patches that are claimed to be ready to apply.

regards, tom lane



Re: [PATCHES] Configurable Penalty Costs for Levenshtein

2008-04-03 Thread Bruce Momjian
Volkan YAZICI wrote:
> Hi,
> 
> Sorry for the delay, but the reply of Tom didn't reach me. I've modified
> the patch according to Tom's comments. I hope I am not too late.

OK, great. I will re-add it to the current queue and add this email as
well.

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>        http://momjian.us
  EnterpriseDB                              http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [PATCHES] Configurable Penalty Costs for Levenshtein

2008-04-02 Thread Volkan YAZICI
Hi,

Sorry for the delay, but the reply of Tom didn't reach me. I've modified
the patch according to Tom's comments. I hope I am not too late.


Regards.



fuzzystrmatch-levenshtein.patch.1
Description: Binary data

On Wed, 2 Apr 2008, Bruce Momjian <[EMAIL PROTECTED]> writes:
> Because of lack of reply from the author, this has been saved for the
> next commit-fest:
>
>   http://momjian.postgresql.org/cgi-bin/pgpatches_hold
>
> ---
>
> Tom Lane wrote:
>> Volkan YAZICI <[EMAIL PROTECTED]> writes:
>> > I noticed a small typo in the patch.
>> >   prev = palloc((m + n) * sizeof(char));
>> > line should look like
>> >   prev = palloc(2 * m * sizeof(char));
>> > instead.
>> 
>> If that's wrong, aren't the comments and the length restriction limit
>> also wrong?



Re: [PATCHES] Configurable Penalty Costs for Levenshtein

2008-04-02 Thread Bruce Momjian

Because of lack of reply from the author, this has been saved for the
next commit-fest:

http://momjian.postgresql.org/cgi-bin/pgpatches_hold

---

Tom Lane wrote:
> Volkan YAZICI <[EMAIL PROTECTED]> writes:
> > I noticed a small typo in the patch.
> >   prev = palloc((m + n) * sizeof(char));
> > line should look like
> >   prev = palloc(2 * m * sizeof(char));
> > instead.
> 
> If that's wrong, aren't the comments and the length restriction limit
> also wrong?
> 
>   regards, tom lane



Re: [PATCHES] Configurable Penalty Costs for Levenshtein

2008-03-11 Thread Tom Lane
Volkan YAZICI <[EMAIL PROTECTED]> writes:
> I noticed a small typo in the patch.
>   prev = palloc((m + n) * sizeof(char));
> line should look like
>   prev = palloc(2 * m * sizeof(char));
> instead.

If that's wrong, aren't the comments and the length restriction limit
also wrong?

regards, tom lane



Re: [PATCHES] Configurable Penalty Costs for Levenshtein

2007-11-28 Thread Bruce Momjian

This has been saved for the 8.4 release:

http://momjian.postgresql.org/cgi-bin/pgpatches_hold

---

Volkan YAZICI wrote:
> Hi,
> 
> I noticed a small typo in the patch.
> 
>   prev = palloc((m + n) * sizeof(char));
> 
> line should look like
> 
>   prev = palloc(2 * m * sizeof(char));
> 
> instead.
> 
> 
> Regards.
> 
> ---(end of broadcast)---
> TIP 5: don't forget to increase your free space map settings



Re: [PATCHES] Configurable Penalty Costs for Levenshtein

2007-11-28 Thread Volkan YAZICI
Hi,

I noticed a small typo in the patch.

  prev = palloc((m + n) * sizeof(char));

line should look like

  prev = palloc(2 * m * sizeof(char));

instead.
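
A minimal standalone sketch of why the size would be 2 * m, under an assumption the thread does not confirm: that the buffer holds two rows (previous and current) of m cells each, with m being the source length plus one. The RowPair type and row_pair_alloc() are hypothetical names, malloc() stands in for palloc(), and the cells are int here although the patch line uses char. Under that assumption, (m + n) cells would be too few whenever n is smaller than m.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical holder for two DP rows that share one allocation. */
typedef struct
{
    int *prev;   /* previous row, m cells */
    int *curr;   /* current row, m cells */
} RowPair;

/*
 * Allocate two rows of m cells each in a single block (assumed layout).
 * Returns 0 on success, -1 if m is 0 or the allocation fails.
 * The caller releases the block with free(rows->prev).
 */
int row_pair_alloc(RowPair *rows, size_t m)
{
    assert(rows != NULL);
    if (m == 0)
        return -1;

    rows->prev = malloc(2 * m * sizeof(int));
    if (rows->prev == NULL)
        return -1;

    /* The second row starts right after the first one. */
    rows->curr = rows->prev + m;
    return 0;
}
```

With this layout the last cell of each row, index m - 1, stays inside the block regardless of how long the target string is.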


Regards.



Re: [PATCHES] Configurable Penalty Costs for Levenshtein

2007-11-05 Thread Bruce Momjian

This has been saved for the 8.4 release:

http://momjian.postgresql.org/cgi-bin/pgpatches_hold

---

Volkan YAZICI wrote:
> Hi,
> 
> Following patch implements configurable penalty costs for levenshtein
> distance metric in fuzzystrmatch contrib module.
> 
> It doesn't introduce any compatibility issues. Just implements
> 
>   levenshtein(text,text,int,int,int)
> 
> function on top of fuzzystrmatch.c:levenshtein_internal(). At the
> same time, levenshtein(text,text) becomes just a wrapper for
> levenshtein(text,text,1,1,1) call.
> 
> BTW, in current CVS tip, the
> 
>   if (cols == 0) ...
>   if (rows == 0) ...
> 
> checks in fuzzystrmatch.c:levenshtein() can never be true, because
> 
>   cols = strlen(str_s) + 1;
>   rows = strlen(str_t) + 1;
> 
> FYI.
> 
> 
> Regards.




Re: [PATCHES] Configurable Penalty Costs for Levenshtein

2007-11-01 Thread Tom Lane
Volkan YAZICI <[EMAIL PROTECTED]> writes:
> Volkan YAZICI <[EMAIL PROTECTED]> writes:
>> Following patch implements configurable penalty costs for levenshtein
>> distance metric in fuzzystrmatch contrib module.

> Is there a problem with the patch? Would anybody mind helping me
> figure out the reason for this lack of interest after 15 days?

The reason is the project schedule: we're trying to get 8.3 out the
door.  No one is likely to pay any attention to new-feature patches
until after 8.4 development begins.

regards, tom lane



Re: [PATCHES] Configurable Penalty Costs for Levenshtein

2007-11-01 Thread Bruce Momjian
Volkan YAZICI wrote:
> Volkan YAZICI <[EMAIL PROTECTED]> writes:
> > Following patch implements configurable penalty costs for levenshtein
> > distance metric in fuzzystrmatch contrib module.
> 
> Is there a problem with the patch? Would anybody mind helping me
> figure out the reason for this lack of interest after 15 days?

I will review it in the next few days.  Sorry.  I am just doing a
cleanup now.



Re: [PATCHES] Configurable Penalty Costs for Levenshtein

2007-11-01 Thread Volkan YAZICI
Volkan YAZICI <[EMAIL PROTECTED]> writes:
> Following patch implements configurable penalty costs for levenshtein
> distance metric in fuzzystrmatch contrib module.

Is there a problem with the patch? Would anybody mind helping me
figure out the reason for this lack of interest after 15 days?


Regards.



[PATCHES] Configurable Penalty Costs for Levenshtein

2007-10-17 Thread Volkan YAZICI
Hi,

Following patch implements configurable penalty costs for levenshtein
distance metric in fuzzystrmatch contrib module.

It doesn't introduce any compatibility issues. Just implements

  levenshtein(text,text,int,int,int)

function on top of fuzzystrmatch.c:levenshtein_internal(). At the
same time, levenshtein(text,text) becomes just a wrapper for
levenshtein(text,text,1,1,1) call.
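
The idea can be sketched in standalone C as follows. This is an illustration only, not the code from the patch: lev_cost() and lev_unit() are hypothetical names standing in for levenshtein_internal() and the two-argument levenshtein(), malloc() replaces palloc(), and plain C strings replace text datums. It keeps only two rows of the cost matrix at a time.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Smallest of three ints. */
static int min3(int a, int b, int c)
{
    int m = (a < b) ? a : b;
    return (m < c) ? m : c;
}

/*
 * Hypothetical stand-in for levenshtein_internal(): cost of turning
 * source into target with per-operation costs.
 * Returns -1 if memory cannot be allocated.
 */
int lev_cost(const char *source, const char *target,
             int ins_c, int del_c, int sub_c)
{
    size_t m, n, i, j;
    int *buf, *prev, *curr, *tmp, result;

    assert(source != NULL && target != NULL);
    m = strlen(source) + 1;     /* cells per row */
    n = strlen(target) + 1;     /* number of rows */

    buf = malloc(2 * m * sizeof(int));
    if (buf == NULL)
        return -1;
    prev = buf;
    curr = buf + m;

    /* Row 0: empty target, so every source character is deleted. */
    for (i = 0; i < m; i++)
        prev[i] = (int) i * del_c;

    for (j = 1; j < n; j++)
    {
        /* Column 0: empty source, so every target character is inserted. */
        curr[0] = (int) j * ins_c;
        for (i = 1; i < m; i++)
        {
            int sub = prev[i - 1] +
                (source[i - 1] == target[j - 1] ? 0 : sub_c);

            curr[i] = min3(prev[i] + ins_c, curr[i - 1] + del_c, sub);
        }
        /* The row just filled becomes the previous row. */
        tmp = prev;
        prev = curr;
        curr = tmp;
    }

    result = prev[m - 1];
    free(buf);
    return result;
}

/* Unit-cost wrapper, mirroring levenshtein(text,text). */
int lev_unit(const char *source, const char *target)
{
    return lev_cost(source, target, 1, 1, 1);
}
```

In this sketch, lev_unit("GUMBO", "GAMBOL") gives 2, while lev_cost("GUMBO", "GAMBOL", 1, 1, 3) gives 3, because one deletion plus one insertion (cost 2) is cheaper than a substitution (cost 3).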

BTW, in current CVS tip, the

  if (cols == 0) ...
  if (rows == 0) ...

checks in fuzzystrmatch.c:levenshtein() can never be true, because

  cols = strlen(str_s) + 1;
  rows = strlen(str_t) + 1;

FYI.
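
A tiny standalone illustration of that point, using a hypothetical helper cells_for() that mirrors the length computation: strlen() returns 0 at minimum, so the result is always at least 1 and a comparison against 0 is dead code.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper mirroring "strlen(...) + 1" for rows and cols. */
size_t cells_for(const char *str)
{
    assert(str != NULL);
    return strlen(str) + 1;     /* at least 1, even for "" */
}
```

Detecting an empty input would therefore mean comparing against 1, or testing the string length directly.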


Regards.
? contrib/fuzzystrmatch/fuzzystrmatch.sql
? contrib/fuzzystrmatch/libfuzzystrmatch.so.0.0
Index: contrib/fuzzystrmatch/README.fuzzystrmatch
===================================================================
RCS file: /projects/cvsroot/pgsql/contrib/fuzzystrmatch/README.fuzzystrmatch,v
retrieving revision 1.10
diff -c -r1.10 README.fuzzystrmatch
*** contrib/fuzzystrmatch/README.fuzzystrmatch	5 Jan 2007 22:19:17 -0000	1.10
--- contrib/fuzzystrmatch/README.fuzzystrmatch	17 Oct 2007 13:58:09 -0000
***************
*** 14,19 ****
--- 14,20 ----
   * found at http://www.merriampark.com/ld.htm
   * Also looked at levenshtein.c in the PHP 4.0.6 distribution for
   * inspiration.
+  * Configurable penalty costs extension is introduced by Volkan YAZICI.
   *
   * metaphone()
   * ---
***************
*** 116,121 ****
--- 117,158 ----
  ==
  Name
  
+ levenshtein -- calculates levenshtein distance with respect
+                to specified costs.
+ 
+ Synopsis
+ 
+ levenshtein(text source, text target,
+             int insert_cost, int delete_cost, int substitution_cost)
+ 
+ Inputs
+ 
+   source
+     any text string, 255 characters max, NOT NULL
+ 
+   target
+     any text string, 255 characters max, NOT NULL
+ 
+   insert_cost
+     cost of extra inserted characters
+ 
+   delete_cost
+     cost of missing characters
+ 
+   substitution_cost
+     cost of character substitutions
+ 
+ Outputs
+ 
+   Returns int
+ 
+ Example usage
+ 
+   select levenshtein('GUMBO','GAMBOL', 1, 1, 3);
+ 
+ ==
+ Name
+ 
  metaphone -- calculates the metaphone code of an input string
  
  Synopsis
***************
*** 140,144 ****
select metaphone('GUMBO',4);
  
  ==
! -- Joe Conway
! 
--- 177,180 ----
select metaphone('GUMBO',4);
  
  ==
! -- Joe Conway
\ No newline at end of file
Index: contrib/fuzzystrmatch/fuzzystrmatch.c
===================================================================
RCS file: /projects/cvsroot/pgsql/contrib/fuzzystrmatch/fuzzystrmatch.c,v
retrieving revision 1.24
diff -c -r1.24 fuzzystrmatch.c
*** contrib/fuzzystrmatch/fuzzystrmatch.c	13 Feb 2007 18:00:35 -0000	1.24
--- contrib/fuzzystrmatch/fuzzystrmatch.c	17 Oct 2007 13:58:09 -0000
***************
*** 15,20 ****
--- 15,21 ----
   * found at http://www.merriampark.com/ld.htm
   * Also looked at levenshtein.c in the PHP 4.0.6 distribution for
   * inspiration.
+  * Configurable penalty costs extension is introduced by Volkan YAZICI.
   *
   * metaphone()
   * ---
***************
*** 47,201 ****
  
  PG_MODULE_MAGIC;
  
  /*
!  * Calculates Levenshtein Distance between two strings.
!  * Uses simplest and fastest cost model only, i.e. assumes a cost of 1 for
!  * each deletion, substitution, or insertion.
   */
! PG_FUNCTION_INFO_V1(levenshtein);
! Datum
! levenshtein(PG_FUNCTION_ARGS)
  {
! 	char	   *str_s;
! 	char	   *str_s0;
! 	char	   *str_t;
! 	int			cols = 0;
! 	int			rows = 0;
! 	int		   *u_cells;
! 	int		   *l_cells;
! 	int		   *tmp;
! 	int			i;
! 	int			j;
! 
! 	/*
! 	 * Fetch the arguments. str_s is referred to as the "source" cols = length
! 	 * of source + 1 to allow for the initialization column str_t is referred
! 	 * to as the "target", rows = length of target + 1 rows = length of target
! 	 * + 1 to allow for the initialization row
! 	 */
! 	str_s = DatumGetCString(DirectFunctionCall1(textout, PointerGetDatum(PG_GETARG_TEXT_P(0))));
! 	str_t = DatumGetCString(DirectFunctionCall1(textout, PointerGetDatum(PG_GETARG_TEXT_P(1))));
! 
! 	cols = strlen(str_s) + 1;
! 	rows = strlen(str_t) + 1;
  
  	/*
! 	 * Restrict the length of the strings being compared to something
! 	 * reasonable because we will have to perform rows * cols calculations. If
! 	 * longer strings need to be compared, increase MAX_LEVENSHTEIN_STRLEN to
! 	 * suit (but within your tolerance for speed and memory usage).
  	 */
! 	if ((cols > MAX_LEVENSHTEIN_STRLEN + 1) || (rows > MAX_LEVENSHTEIN_STRLEN + 1))
  		ereport(ERROR,
  				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
  				 errmsg("argument exceeds the maximum length of %d bytes",
  						MAX_LEVENSHTEIN_STRLEN)));
  
! 	/*
! 	 * If either rows or cols is 0, the answer is the other value. This makes
! 	 * sen