Øystein Johansen wrote:
> Jonathan Kinsey wrote:
>> Can someone explain how the NN works in terms of the NNEVAL_SAVE and
>> NNEVAL_FROMBASE operations?
> 
> It's a smart trick invented by Joseph. Notice that the original
> implementation (before the SSE vectorization) had an "if (input_node)"
> test, and when the input node was 0 it just moved a pointer. This test
> saved a lot of time when the input node was 0 (and it often is). An
> evaluation with lots of zeros in the input takes considerably less
> processing time than one with mostly non-zero inputs.
> 
> So Joseph, ingenious as he is, invented the trick of storing the input
> array and the resulting hidden layer array when the first candidate
> move in a movelist is evaluated. When evaluating the next candidate
> move, we simply subtract the new input array from the stored input
> array, hoping to "generate" a lot of zeros for the evaluator. (As
> described above, zeros are faster!) To compensate at the hidden layer,
> the result of evaluating the modified input array is subtracted from
> the stored hidden layer array to get the right result. The evaluation
> step from the hidden layer to the output layer is unchanged.
> 
> When I first saw and understood this clever trick, I was mighty
> impressed. The speedup is not huge, but it is significant: about 10%,
> I guess. Joseph may have more accurate figures. It's not as big a
> speed improvement as the pruning nets, but it is (or was) worth having.
> 
> The SSE vectorization doesn't do this trick. 

Are you sure? The code looks the same in both places to me.

> Of course it then has to
> check if all four inputs are zeros to just move the result pointer to
> the next position, without doing any arithmetic. Maybe it's worth doing
> something similar for the vectorized code? If not, we may remove the
> clever trick from the SSE code, so it's not wasting cycles subtracting
> the input array to no purpose.
> 
> NNEVAL_SAVE is then a flag to the evaluator to do a normal evaluation
> and save the input array and the hidden layer array.
> 
> NNEVAL_FROMBASE is a flag telling the evaluator to evaluate from a
> saved set of inputs and a saved set of hidden layer values.

I think I need to have separate copies of savedIBase and savedBase for
each thread (as well as for each NN). They're quite tied up with the
nContext array, so I might try to join them together in some way...
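
Something along these lines is what I have in mind, as a rough sketch only.
The eval_state struct, get_state(), save_base() and all the sizes are
hypothetical, and I have left nContext out entirely; only the savedIBase
and savedBase names come from the existing code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical limits, for illustration only. */
enum { MAX_THREADS = 4, NUM_NETS = 3, MAX_IN = 8, MAX_HID = 4 };

/* One saved base per (thread, net) pair, so that threads never
 * overwrite each other's saved arrays. */
typedef struct {
    int   valid;               /* set once a save pass has run */
    float savedIBase[MAX_IN];  /* saved input array */
    float savedBase[MAX_HID];  /* saved hidden layer array */
} eval_state;

static eval_state states[MAX_THREADS][NUM_NETS];

/* Look up the state for a given thread and net; NULL if out of range. */
static eval_state *get_state(size_t thread, size_t net)
{
    if (thread >= MAX_THREADS || net >= NUM_NETS)
        return NULL;
    return &states[thread][net];
}

/* Record the arrays from a save pass in the given state. */
static void save_base(eval_state *s, const float *in, const float *hid)
{
    memcpy(s->savedIBase, in, sizeof s->savedIBase);
    memcpy(s->savedBase, hid, sizeof s->savedBase);
    s->valid = 1;
}
```

Each thread would only ever touch its own row of the table, so no locking
should be needed around the saved arrays themselves.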

Jon

_______________________________________________
Bug-gnubg mailing list
[email protected]
http://lists.gnu.org/mailman/listinfo/bug-gnubg