Bug#820514: crafty: Please package latest upstream 25.0.1

Douglas Calvert Sat, 09 Apr 2016 03:33:30 -0700

Package: crafty
Version: 23.4-6+b1
Severity: normal

There have been a number of improvements in crafty since 23.4:


23.5   Several pieces of code cleanup, both for readability and speed.   
       We are now using stdint.h, which lets us specific the exact size  
       of an integer value which gets rid of the int vs long issues that 
       exist between MSVC and the rest of the world on X86_64.  We still 
       use things like "int" if we want to let the compiler choose the   
       most efficient size, but when we need a specific number of bits,  
       such as 64, we use a specific type (uint64_t for 64 bits).  Minor 
       change to forward pruning that does not toss out passed pawn      
       moves if they are advanced enough (compared to remaining depth)   
       that they might be worth searching.  Cleanup to passed pawn eval  
       code to help efficiency and readability.  New options to the      
       display command.  "display nothing" turns all output off except   
       for the move Crafty chooses to play.  "display everything" will   
       produce a veritable flood of information before it makes a move.  
       Cleanup of personality code to fix some values that were not      
       included, as well as clean up the formatting for simplicity.      
                                                                         
23.6   Minor tweak to "adaptive hash" code + a fix to the usage warning  
       that failed to explain how many parameters are required.  New way 
       of timing the search, replacing the old "easy move" code that was 
       simply too ineffective.  Now Crafty computes a "difficulty" level 
       after each iteration.  If the best move doesn't change, this      
       level is reduced by 10% of the current value, until it reaches    
       60% where it stops dropping.  If the best move changes during an  
       iteration, it adjusts in one of two ways.  If the current         
       difficulty level is less than 100%, it reverts to 100% + the      
       number of different best moves minus 1 times 20%.  If the current 
       difficulty level is already >= 100% it is set to 80% of the       
       current value + (again) the number of different best moves - 1    
       times 20%.  Repeated changes can run this up to 180% max.  As the 
       search progresses this difficulty level represents the percentage 
       of the nominal target time limit it should use.  It still tries   
       to complete the current iteration before quitting, so this limit  
       is a lower bound on the time it will use.  Restored an old idea   
       from earlier Crafty versions (and even Cray Blitz), that of       
       trying the killers from two plies earlier in the tree, once the   
       killers for the current ply have been tried.  Was a +10 Elo gain, 
       added to about +10 for the new time control logic.  Old fail-high 
       restriction removed.  At one point in time, a fail-high on the    
       null-window search at the root would cause problems.  Crafty was  
       modified so that the fail-high was ignored if the re-search       
       failed low.  Removing this produced an 8 Elo gain.                
                                                                         
23.7   Minor fix to time check code.  Was particularly broken at very    
       low skill level settings, but it had a small impact everywhere.   
       Some simplification with the reduction code, and how moves are    
       counted as they are searched, which would not count any moves     
       that were pruned, which could delay triggering reductions at a    
       ply.  Fixed significant killer move bug that would only surface   
       during a parallel search.  NextMove (remaining moves phase) culls 
       killer moves as it selects, because they have already been tried. 
       Unfortunately, in a parallel search, the killers could get        
       modified by other threads, which could erroneously cause a move   
       to be excluded that had not been searched.  I discovered this bug 
       on wac #266 where Crafty normally finds a mate at depth=11, but   
       a parallel search would often miss the mate completely.  Minor    
       fix to new "difficulty" time management.  It was possible for a   
       move to look "easy" (several iterations without finding a new     
       best move) but it could fail low at the beginning of the next     
       iteration.  Difficulty would be reduced due to not changing the   
       best move at previous iterations, so the search could time out    
       quicker than expected.  Simple fix was to reset difficulty to     
       100% on a fail low, which makes sense anyway if it was supposed   
       to be "easy" but the score is mysteriously dropping anyway.       
                                                                         
23.8   MAXPLY changed to 128, which increases the max length of the PV's 
       that can be displayed.  The hash path stuff works so well that it 
       often STILL has PVs with <HT> on the end, after 64 moves are      
       displayed.  While this can still probably be reached, it should   
       be much less frequent.  This does mean the search can go for 128  
       iterations, rather than the old 64 limit, of course.  Bug in      
       Search() that failed to update the PV on an EGTB hit, which would 
       cause the hashed PV to contain illegal moves.  When Crafty then   
       tried to display this PV, it would promptly crash.                
                                                                         
24.0   Major changes to portability/platform specific code.  Crafty now  
       supports Unix and Windows platforms.  All the other spaghetti     
       code has been removed, with an effort to make the code readable.  
       Couple of bugs in EvaluateWinningChances() fixed, one in          
       particular that makes the krp vs kr draw detection unreachable.   
       strcpy() replaced with memmove() to fix problems with Apple OS X  
       Mavericks where strcpy() with operands that overlap now causes an 
       application to abort on the spot.  Bug in Search() that caused an 
       inefficiency in the parallel search.  At some point the way moves 
       are counted within a ply was changed, which changed the split     
       algorithm unintentionally.  The original intent was to search     
       just one move and then allow splits.  The bug changed this to     
       search TWO moves before splitting is allowed, which slowed things 
       down quite a bit.  thread[i] is now an array of structures so     
       that the thread pointers are separated from each other to fit     
       into separate cache blocks, this avoids excessive cache traffic   
       since they get changed regularly and were sitting in a single     
       cache block with 8 cpus on 64 bit architectures.  Bug in new      
       "difficulty" time usage code would let the program run up against 
       "absolute_time_limit" before stopping the search, even if we      
       finished an iteration using beyond the normal target.  This       
       really burned a lot of unnecessary time.  I had previously fixed  
       the PVS window fail high (at the root) to force such a move to be 
       played in the game.  One unfortunate oversight however was that   
       when splitting at the root, N different processors could fail     
       high on N different moves, and they were searching those in       
       parallel.  I've modified this so that the PVS fail high at the    
       root exits from Search() (after stopping other busy threads) and  
       then returns to Iterate() which will print the fail-high notice   
       and re-start the search with all threads searching just one fail  
       high move to get a score quicker.  Fail-high/fail-low code clean- 
       up.  In particular, the "delta" value to increment beta or        
       decrement alpha is now two variables, so that an early fail-low   
       won't increment the delta value and then the subsequent (hope-    
       fully) fail-high starts off with a bad increment and jumps beta   
       too quickly.  All fail-highs at the root return to Iterate(),     
       even the PVS null-window fail-highs.  This re-starts the search   
       with all processors working on that one move to get it resolved   
       much quicker.  Other minor fixes such as when failing low on the  
       first move, we pop out of search and go back to Iterate() to      
       lower alpha and start a new search.  Unfortunately, after the     
       first move is searched, a parallel split at the root would occur, 
       only to be instantly aborted so that we could back out of Search()
       and then re-start.  Wasted the split time unnecessarily although  
       it was not a huge amount of time.  Another simplification to the  
       repetition-detection code while chasing a parallel search bug     
       was missing repetitions.                                          
                                                                         
24.1   LMR reductions changed to allow a more dynamic reduction amount   
       rather than just 1 or 2.  An array LMR_reductions[depth][moves]   
       is initialized at start-up and sets the reduction amount for a    
       specific depth remaining (larger reduction with more depth left)  
       and the total moves searched at this node (again a larger         
       reduction as more moves are searched.)  This can easily be        
       changed with the new "lmr" command (help lmr for details).  The   
       old "noise" stuff has been changed.  Now, rather than noise       
       specifying the number of nodes that have to be searched before    
       any output is displayed (PV, fail high, etc) it now specifies a   
       time that must pass before the first output is produced.  This    
       can still be displayed with noise=0, otherwise noise=n says no    
       output until n seconds have passed (n can be a fraction of a      
       second if wanted.)  I restored the old adaptive null-move code,   
       but changed the reduction (used to be 2 to 3 depending on the     
       depth remaining) so that it is 3 + depth / 6.  For each 6 plies   
       of remaining depth, we reduce by another ply, with no real bound  
       other than the max depth we can reasonably reach.  The "6" in     
       the above can be changed with the personality command or the new  
       null m n command.  When the changes to noise were made, Iterate() 
       was also changed so that if the search terminates for any reason  
       and no PV was displayed, whether because the noise limit was not  
       satisfied or this search did not produce a PV because it started  
       at an excessively deep search depth, Crafty now always displays   
       at least one PV in the log file to show how the game is going to  
       continue.  One case where this was an issue was where Crafty did  
       a LONG ponder search because the opponent took an excessive       
       amount of time to make a move.  If it finishes (say) a 30 ply     
       search while pondering, when the opponent moves, Crafty will move 
       instantly, but when it starts the next search, we start pondering 
       at iteration 29, which might take a long time to finish.  But we  
       do still have the PV from the last search (the 30 ply one) and    
       main() thoughtfully removed the first two moves (the move played  
       and the move we are now pondering, so we always have something to 
       display, and now we no longer terminate a search with no output   
       (PV, time, etc) at all.  Generation II of the parallel split code 
       (see Thread() in the thread.c source file comments for specifics) 
       that is a lighter-weight split/copy plus the individual threads   
       do most of the copying rather than the thread that initiates the  
       split.  More work on the skill command.  For each 10 units the    
       skill level is reduced, the search speed is reduced by 50%.       
       Additionally the null-move and LMR reduction values are also      
       ramped down so that by the time skill drops below 10, there are   
       no reductions left.                                               
                                                                         
24.2   Bug in SetTime() fixed.  The absolute time limit was set way too  
       large in sudden death games (games with no secondary time control 
       to add time on the clock) with an increment.  It was intended to  
       allow the search to use up to 5x the normal target time, unless   
       that was larger than 1/2 of the total time remaining, but some-   
       where along the way that 5x was removed and it always set the     
       absolute time limit to 1/2 the remaining clock, which would burn  
       clock time way too quickly.  Complete re-factor of Search() to    
       take the main loop and move that to a new procedure               
       SearchMoveList() to eliminate the duplicated search code between  
       Search() and SearchParallel() which has been replaced by          
       SearchMoveList().  New addition to NextMove().  It was cleaned up 
       to eliminate the old NextEvasion() code since one was a sub-set   
       of the other.  A minor change to limit the NPS impact on history  
       ordering.  We do not do history ordering within 6 plies of a      
       terminal node, and we only try to order 1/2 of the moves before   
       giving up and assuming this is an ALL node where ordering does    
       not matter.                                                       
                                                                         
25.0   Complete re-factor of pawn evaluation code.  There were too many  
       overlapping terms that made tuning difficult.  Now a pawn is      
       classified as one specific class, there is no overlap between     
       classes, which simplifies the code significantly.  The code is    
       now easier to understand and modify.  In addition, the passed     
       pawn evaluation was rewritten and consolidates all the passed     
       pawn evaluation in one place.  The evaluation used to add a bonus 
       for rooks behind passed pawns in rook scoring, blockading some-   
       where else, etc.  All of this was moved to the passed pawn code   
       to make it easier to understand and modify.  A limited version of 
       late move pruning (LMP) is used in the last two plies.  Once a    
       set number of moves have been searched with no fail high, non-    
       interesting moves are simply skipped in a way similar to futility 
       pruning.  This version also contains a major rewrite of the       
       parallel search code, now referred to as Generation II.  It has a 
       more lightweight split algorithm, that costs the parent MUCH less 
       effort to split the search.  The key is that now the individual   
       "helper" threads do all the work, allocating a split block,       
       copying the data from the parent, etc., rather than the parent    
       doing it all.  Gen II based on the DTS "late-join" idea so that a 
       thread will try to join existing split points before it goes to   
       the idle wait loop waiting for some other thread to split with    
       it.  In fact, this is now the basis for the new split method      
       where the parent simply creates a split point by allocating a     
       split block for itself, and then continuing the search, after the 
       parent split block is marked as "joinable".  Now any idle threads 
       just "jump in" without the parent doing anything else, which      
       means that currently idle threads will "join" this split block if 
       they are in the "wait-for-work" spin loop, and threads that be-   
       come idle also join in exactly the same way.  This is MUCH more   
       efficient, and also significantly reduces the number of split     
       blocks needed during the search.  We also now pre-split the       
       search when we are reasonably close to the root of the tree,      
       which is called a "gratuitous split.  This leaves joinable split  
       points laying around so that whenever a thread becomes idle, it   
       can join in at these pre-existing split points immediately.  We   
       now use a much more conservative approach when dealing with fail  
       highs at the root.  Since LMR and such have introduced greater    
       levels of search instability, we no longer trust a fail-high at   
       the root if it fails low on the re-search.  We maintain the last  
       score returned for every root move, along with the PV.  Either an 
       exact score or the bound score that was returned.  At the end of  
       the iteration, we sort the root move list using the backed-up     
       score as the sort key, and we play the move with the best score.  
       This solves a particularly ugly issue where we get a score for    
       the first move, then another move fails high, but then fails low  
       and the re-search produces a score that is actually WORSE than    
       the original best move.  We still see that, but we always play    
       the best move now.  One other efficiency trick is that when the   
       above happens, the search would tend to be less efficient since   
       the best score for that fail-high/fail-low move is actually worse 
       than the best move/score found so far.  If this happens, the      
       score is restored to the original best move score (in Search())   
       so that we continue searching with a good lower bound, not having 
       to deal with moves that would fail high with this worse value,    
       but not with the original best move's value.  Minor change to     
       history counters that now rely on a "saturating counter" idea.  I 
       wanted to avoid the aging idea, and it seemed to not be so clear  
       that preferring history moves by the depth at which they were     
       good was the way to go.  I returned to a history counter idea I   
       tested around 2005 but discarded, namely using a saturating       
       counter.  The idea is that a center value (at present 1024)       
       represents a zero score.  Decrementing it makes it worse,         
       incrementing it makes it better.  But to make it saturate at      
       either end, I only reduce the counter by a fraction of its        
       distance from the saturation point so that once it gets to either 
       extreme value, it will not be modified further avoiding wrap-     
       around.  This basic idea was originally reported by Mark Winands  
       in 2005.  It seems to provide better results (slightly) on very   
       deep searches.  One impetus for this was an intent to fold this   
       into a move so that I could sort the moves rather than doing the  
       selection approach I currently use.  However, this had a bad      
       effect on testing, since history information is dynamic and is    
       constantly changing, between two moves at the same ply in fact.   
       The sort fixed the history counters to the value at the start of  
       that ply.  This was discarded after testing, but the history      
       counter based on the saturating counter idea seemed to be OK and  
       was left in even though it produced minimal Elo gain during       
       testing.  Change to the way moves are counted, to add a little    
       more consistency to LMR.  Now Next*() returns an order number     
       that starts with 1 and monotonically increases, this order number 
       is used for LMR and such decisions that vary things based on how  
       far down the move list something occurs.  Root move counting was  
       particularly problematic with parallel searching, now things are  
       at least "more consistent".  The only negative impact is that now 
       the move counter gets incremented even for illegal moves, but     
       testing showed this was a no-change change with one thread, and   
       the consistency with multiple threads made it useful.  New method 
       to automatically tune SMP parameters.  The command is autotune    
       and "help autotune" will explain how to run it.  Added the        
       "counter-move" heuristic for move ordering (Jos Uiterwijk, JICCA) 
       which simply remembers a fail high move and the move at the       
       previous ply.  If the hash, captures or killer moves don't result 
       in a fail high, this move is tried next.  No significant cost,    
       seems to reduce tree size noticeably.  Added a follow-up idea     
       based on the same idea, except we pair a move that fails high     
       with the move two plies back, introducing a sort of "connection"  
       between them.  This is a sort of "plan" idea where the first move 
       of the pair enables the second move to fail high.  The benefit is 
       that this introduces yet another pair of good moves that get      
       ordered before history moves, and is therefore not subject to     
       reduction.  I have been unable to come up with a reference for    
       this idea, but I believe I first saw it somewhere around the time 
       Fruit showed up, I am thinking perhaps in the JICCA/JICGA.  Any   
       reference would be appreciated.  Minor change to the way the PV   
       and fail-hi/fail-low moves are displayed when pondering.  Crafty  
       now adds the ponder move to the front of the PV enclosed in ( )   
       so that it is always visible in console mode.  The depths are     
       reaching such extreme numbers the ponder move scrolls off the top 
       of the screen when running in console mode or when "tail -f" is   
       used to watch the log file while a game is in progress.  This is  
       a bit trickier than you might think since Crafty displays the     
       game move numbers in the PV.  Penalty for pawns on same color as  
       bishop now only applies when there is one bishop.                 
                                                                         
25.0.1 Cleanup of NextMove() plus a minor ordering bug fix that would    
       skip counter moves at ply = 2. Added NUMA code to force the hash  
       tables to be spread across the numa nodes as equally as possible  
       rather than all of the data sitting on just onenode.  This makes  
       one specific user policy important.  BEFORE you set the hash size 
       for any of the four hash tables, you should ALWAYS set the max    
       threads limit first, so that the NUMA trick works correctly.  Of  
       course, if you do not use -DAFFINITY this is all irrelevant.  The 
       -DNUMA option has been removed.  I no longer use any libnuma      
       routines.  A new "smpnuma" command is now used to enable/disable  
       NUMA mode (which really only affects how the hash tables are      
       cleared, all the other NUMA code works just fine no matter        
       whether this is enabled or disabled.  Fixed a bug with the xboard 
       memory command that could overflow and cause preposterous malloc  
       requests.                                                         
                                                                         




-- System Information:
Debian Release: stretch/sid
  APT prefers unstable
  APT policy: (990, 'unstable'), (1, 'experimental')
Architecture: amd64 (x86_64)

Kernel: Linux 4.4.0-1-amd64 (SMP w/4 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash
Init: systemd (via /run/systemd/system)

Versions of packages crafty depends on:
ii  libc6       2.22-5
ii  libnuma1    2.0.11-1
ii  libstdc++6  5.3.1-13

Versions of packages crafty recommends:
pn  crafty-books-medtosmall | crafty-books-small  <none>
ii  xboard                                        4.8.0-2

crafty suggests no packages.

-- no debconf information

Bug#820514: crafty: Please package latest upstream 25.0.1

Reply via email to