Øystein Johansen wrote:
> Jonathan Kinsey wrote:
>> Can someone explain how the NN works in terms of the NNEVAL_SAVE and
>> NNEVAL_FROMBASE operations?
>
> It's a smart trick invented by Joseph. Notice that in the original
> implementation (before the SSE vectorization) there was an
> "if (input_node)" test, and if the input was zero the code just moved
> a pointer. This test saved a lot of time when the input node was 0
> (and it often is). An evaluation with lots of zeros in the input takes
> considerably less processing time than one with non-zero inputs.
>
> So Joseph, ingenious as he is, invented the trick of storing the input
> array and the resulting hidden layer array when the first candidate
> move in a movelist is considered. When evaluating the next move
> candidate, we simply subtract the new input array from the stored input
> array, and in that way we're hoping to "generate" a lot of zeros for the
> evaluator. (As described above, zeros are faster!) Now, to compensate at
> the hidden layer, the result from the modified input array evaluation is
> subtracted from the stored hidden layer array, to get the right result.
> The evaluation step from the hidden layer to the output layer is
> unchanged.
>
> When I first saw and understood this clever trick, I was mighty
> impressed. The speedup is not superb, but significant. About 10% I
> guess. Joseph may have more accurate results. It's not as big a speed
> improvement as the pruning nets, but it is (or was) worth having.
>
> The SSE vectorization doesn't do this trick.

Are you sure? The code looks the same in both places to me.

> Of course it then has to
> check if all four inputs are zeros to just move the result pointer to
> the next position, without doing any arithmetic. Maybe it's worth doing
> something similar for the vectorized code? If not, we may remove the
> clever trick from the SSE code, so it's not wasting cycles subtracting
> the input array for no use.
>
> NNEVAL_SAVE is then a flag to the evaluator to do a normal evaluation
> and save the input array and the hidden layer array.
>
> NNEVAL_FROMBASE is a flag to evaluate from a saved set of inputs
> and a saved set of hidden layer values.

I think I need to have separate copies of savedIBase and savedBase for
each thread (as well as for each NN). It's quite tied up with the
nContext array, so I might try to join them together in some way...

Jon
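
For readers following the thread, the save/from-base idea described above can be sketched in a few lines of C. This is only an illustration under stated assumptions, not the gnubg code: the names eval_mode, eval_state, accumulate and hidden_sums are hypothetical, the layer sizes are made up, and the saved hidden values are assumed to be the weighted sums before the activation function is applied. The fields saved_inputs and saved_hidden play the role that savedIBase and savedBase have in the discussion, and keeping them in a caller-owned struct is one possible way to give each thread its own copy.

```c
#include <string.h>

#define N_INPUT  8   /* illustrative sizes, not gnubg's */
#define N_HIDDEN 3

/* Hypothetical stand-ins for the "none / save / from base" modes. */
typedef enum { EVAL_NONE, EVAL_SAVE, EVAL_FROMBASE } eval_mode;

/* Saved state kept outside the net, so each caller (e.g. each thread)
 * can own a separate copy. */
typedef struct {
    float saved_inputs[N_INPUT];   /* inputs of the base position */
    float saved_hidden[N_HIDDEN];  /* pre-activation hidden sums of the base */
} eval_state;

/* Add sign * in[i] * w[i][j] to sum[j], skipping zero inputs.
 * Returns the number of non-zero inputs that had to be processed. */
static int accumulate(const float *w, const float *in, float sign, float *sum)
{
    int touched = 0;
    for (int i = 0; i < N_INPUT; ++i) {
        if (in[i] == 0.0f)
            continue;              /* the cheap path: nothing to add */
        ++touched;
        for (int j = 0; j < N_HIDDEN; ++j)
            sum[j] += sign * in[i] * w[i * N_HIDDEN + j];
    }
    return touched;
}

/* Compute the hidden-layer sums for `in`. In EVAL_FROMBASE mode only the
 * difference to the saved inputs is pushed through the weights. */
static int hidden_sums(const float *w, const float *bias, const float *in,
                       eval_state *st, eval_mode mode, float *sum)
{
    if (mode == EVAL_FROMBASE) {
        float diff[N_INPUT];
        for (int i = 0; i < N_INPUT; ++i)
            diff[i] = st->saved_inputs[i] - in[i];   /* mostly zeros */
        memcpy(sum, st->saved_hidden, sizeof st->saved_hidden);
        /* saved - W*(saved_in - in) equals the sums for `in` itself */
        return accumulate(w, diff, -1.0f, sum);
    }

    memcpy(sum, bias, N_HIDDEN * sizeof *sum);
    int touched = accumulate(w, in, 1.0f, sum);
    if (mode == EVAL_SAVE) {
        memcpy(st->saved_inputs, in, sizeof st->saved_inputs);
        memcpy(st->saved_hidden, sum, sizeof st->saved_hidden);
    }
    return touched;
}
```

The first candidate move would be evaluated with EVAL_SAVE and the following ones with EVAL_FROMBASE, with the activation and the hidden-to-output step applied to the sums afterwards as usual. The gain depends entirely on how many entries of the difference array are zero, which is why a vectorized loop that never skips zeros gets nothing out of the subtraction.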
