Hi Pascal,

Sorry to hear that it does not interest you.

I made some changes locally that seem to do the job, at least as far as
part 1 below is concerned. A correct approach to this part should make
part 2 obsolete. It is nice to know where game dates seem to be
inconsistent with players' dates of birth or death, but this can be
derived from a second run, which should list only those cases. For
now I consider e.g. setting a game list filter to those inconsistencies
a nice-to-have and slightly off-topic.

I am not aware of a specific how-to for submitting changes. Can you
point me in the right direction?
For now I have attached a CVS patch.

The changes I made are trivial:
1) The spellcheck code itself (sc_name_spellcheck in src/tkscid.cpp)
takes the maximum number of corrections as an optional [-max N]
parameter. In the absence of this parameter, a hard-coded maximum of
20000 is applied. Changing this default to (uint)-1, the largest value
a uint can hold, effectively removes the hard-coded limit.

Having said this, in src/common.h I see the following statement:
        typedef unsigned int uint;   // 32 bit unsigned

This is a bit unfortunate, since int is not guaranteed to be 32 bits
wide; the C and C++ standards only require at least 16. Although we are
probably safe here, it is better to use "long" explicitly where at
least 32 bits are required.

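To make the remark about the typedef concrete, here is a small
standalone sketch (not part of the patch, and assuming a C++11 compiler
for static_assert). The names uint32plus and uintBits are made up for
this example and do not exist in the Scid sources.

```cpp
#include <limits>

// The C and C++ standards only guarantee minimum widths:
// unsigned int has at least 16 bits, unsigned long at least 32 bits.

// Hypothetical typedef (not in Scid) for values that need 32 bits or more.
typedef unsigned long uint32plus;

// Holds on every conforming compiler, so this is a cheap safety net.
static_assert(std::numeric_limits<uint32plus>::digits >= 32,
              "unsigned long must have at least 32 value bits");

// Hypothetical helper: number of value bits in unsigned int on this platform.
inline int uintBits()
{
    return std::numeric_limits<unsigned int>::digits;
}
```

On common desktop platforms uintBits() will most likely report 32, which
is why the existing typedef is probably safe in practice.
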
2) In the tcl layer there are two entry points for the above:
In tcl/file/maint.tcl (doCleaner) the spellchecker is invoked as part of
the "Cleaner..." operation, currently with a maximum of 100000
corrections. The -max parameter can be removed, or even be replaced by
some user setting.
In tcl/file/spellchk.tcl (openSpellCheckWin and updateSpellCheckWin) the
operation is called (for "Player...") with a maximum of 2000.
Remove or change this -max as well.
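
The combined effect of 1) and 2) can be sketched as follows. This is a
simplified standalone illustration, not the actual option handling in
sc_name_spellcheck; parseMax and kNoLimit are hypothetical names used
only here.

```cpp
#include <cstdlib>
#include <cstring>
#include <limits>

typedef unsigned int uint;   // same typedef as in src/common.h

// Hypothetical constant: converting -1 to uint wraps to the largest uint.
const uint kNoLimit = (uint)-1;

// Hypothetical helper, not Scid code: return the number that follows
// "-max" in argv, or kNoLimit when the option is absent.
uint parseMax(int argc, const char ** argv)
{
    for (int i = 0; i + 1 < argc; i++) {
        if (std::strcmp(argv[i], "-max") == 0) {
            return (uint) std::strtoul(argv[i + 1], NULL, 10);
        }
    }
    return kNoLimit;
}
```

With this default, a caller that passes "-max 2000" is still capped at
2000, while a caller that omits -max (as the patched tcl code does) gets
a limit that no real database will reach.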

I tested this successfully on Linux (3.7M games; 84367 corrections in
one go, applied 5.5 million times, leaving 1367 inconsistencies).

Not tested on other platforms...

Cheers,
Joost.

On Tue, 2009-02-17 at 20:32 +0100, Pascal Georges wrote:
> 
> 
> 2009/2/16 Joost ´t Hart <joost.t.h...@planet.nl>
>         Hi there!
>         
>         [scid3.6.26 - WinXP]
>         
>         After importing a huge number of games (about 3.7M) into a fresh
>         database, I wanted to do the name corrections.
>         
>         1) Is there any particular reason why scid corrects only 2000
>         names in a go? Might mean I have to run this dialog about 70
>         times (!) to correct the complete database...
> 
> Yes, this part is sub-optimal. Anybody wanting to look at it ?
>  
>         
>         
>         2) It seems that scid runs into (non-ambiguous) replacements
>         that it cannot do after all (might be e.g. corrections to games
>         that this player cannot have played because they are outside
>         his time of life). This is a clever precaution, but these
>         player names keep on returning in the to-be-corrected list,
>         reducing the number of corrections to make even further...
>         
>         What can I do to complete this procedure a bit more
>         efficiently?
> 
> The code has to be changed and the process for names checking : this
> does not interest me, so volunteers ?
> 
> Pascal
> 
> ------------------------------------------------------------------------------
> Open Source Business Conference (OSBC), March 24-25, 2009, San Francisco, CA
> -OSBC tackles the biggest issue in open source: Open Sourcing the Enterprise
> -Strategies to boost innovation and cut costs with open source participation
> -Receive a $600 discount off the registration fee with the source code: SFAD
> http://p.sf.net/sfu/XcvMzF8H
> _______________________________________________ Scid-users mailing list 
> Scid-users@lists.sourceforge.net 
> https://lists.sourceforge.net/lists/listinfo/scid-users
Index: src/tkscid.cpp
===================================================================
RCS file: /cvsroot/scid/scid/src/tkscid.cpp,v
retrieving revision 1.29
diff -U5 -r1.29 tkscid.cpp
--- src/tkscid.cpp	10 Jan 2009 17:51:57 -0000	1.29
+++ src/tkscid.cpp	18 Feb 2009 14:27:17 -0000
@@ -12205,11 +12205,11 @@
 int
 sc_name_spellcheck (ClientData cd, Tcl_Interp * ti, int argc, const char ** argv)
 {
 #ifndef WINCE
     nameT nt = NAME_INVALID;
-    uint maxCorrections = 20000;
+    uint maxCorrections = (uint)-1;
     bool doSurnames = false;
     bool ambiguous = true;
     const char * usage = "Usage: sc_name spellcheck [-max <integer>] [-surnames <boolean>] [-ambiguous <boolean>] players|events|sites|rounds";
 
     const char * options[] = {
Index: tcl/file/maint.tcl
===================================================================
RCS file: /cvsroot/scid/scid/tcl/file/maint.tcl,v
retrieving revision 1.11
diff -U5 -r1.11 maint.tcl
--- tcl/file/maint.tcl	11 Jan 2009 16:03:29 -0000	1.11
+++ tcl/file/maint.tcl	18 Feb 2009 14:27:17 -0000
@@ -1481,11 +1481,11 @@
     set tag [string tolower $names]
     if {$cleaner($tag)} {
       mtoolAdd $t "$count: $::tr(Spellchecking): $::tr($names)..."
       incr count
       set result "0 $nameType names were corrected."
-      if {! [catch {sc_name spellcheck -max 100000 $nameType} corrections]} {
+      if {! [catch {sc_name spellcheck $nameType} corrections]} {
         update
         catch {sc_name correct $nameType $corrections} result
       }
       $t insert end "   $result\n"
       $t see end
Index: tcl/file/spellchk.tcl
===================================================================
RCS file: /cvsroot/scid/scid/tcl/file/spellchk.tcl,v
retrieving revision 1.3
diff -U5 -r1.3 spellchk.tcl
--- tcl/file/spellchk.tcl	23 Dec 2008 20:10:46 -0000	1.3
+++ tcl/file/spellchk.tcl	18 Feb 2009 14:27:17 -0000
@@ -39,11 +39,11 @@
   global spellcheckAmbiguous
   busyCursor .
   .spellcheckWin.text.text delete 1.0 end
   #.spellcheckWin.text.text insert end "Finding player corrections..."
   update idletasks
-  catch {sc_name spellcheck -max $spell_maxCorrections \
+  catch {sc_name spellcheck \
            -surnames $spellcheckSurnames \
            -ambiguous $spellcheckAmbiguous $type} result
   .spellcheckWin.text.text delete 1.0 end
   .spellcheckWin.text.text insert end $result
   unbusyCursor .
@@ -64,11 +64,11 @@
     if {![readSpellCheckFile]} {
       return
     }
   }
   busyCursor .
-  if {[catch {sc_name spellcheck -max $spell_maxCorrections \
+  if {[catch {sc_name spellcheck \
                 -surnames $spellcheckSurnames \
                 -ambiguous $spellcheckAmbiguous $type} result]} {
     unbusyCursor .
     tk_messageBox -type ok -icon info -title "Scid: Spellcheck results" \
       -parent $parent -message $result