The CUDA implementation is currently rotting in algorithm/A51/implementation/cuda/kernel/bitslice.hpp.
The file you mentioned is not used anywhere, but it is indeed a bug, and it is not the only one in that file. The CUDA code is currently not functional in any file in the repository.

>I recently adopted the bitslice approach in my own software and found
>a useless instruction generated by a macro that affects the
>performance of the CUDA implementation.
>
>See: /algorithm/A51/implementation/common/partitioned_bitslice.hpp
>
>Revision 79, Line: 115
>
>BOOST_PP_REPEAT(23, pbs_clock_r3,);
>
>22 repetitions are enough:
>
>BOOST_PP_REPEAT(22, pbs_clock_r3,);
>
>Luckily it does not affect the correctness. Due to the fact that: "If
>the difference between x and y is less than 0, the result is saturated
>to 0." no error was produced. Instead just an unnecessary r3_0
>assignment that does not make any sense:
>
>r3_0 = r3_0 & not_clock_r3 | r3_0 & do_clock_r3;
>
>If the compiler would recognize that not_clock_r3 and do_clock_r3 are
>complements maybe this command would be omitted.
>_______________________________________________
>A51 mailing list
>A51@lists.reflextor.com
>http://lists.lists.reflextor.com/cgi-bin/mailman/listinfo/a51
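
For reference, the claim that the extra r3_0 assignment is harmless can be checked with a small sketch. It assumes that not_clock_r3 is the exact bitwise complement of do_clock_r3, as the quoted message suggests. The names clock_select and check_noop are hypothetical helpers made up for this illustration, not code from the repository, and 32-bit words are an arbitrary choice.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not from the repository): one bitsliced clocking
// step for a single register bit. Lanes whose do_clock bit is set take
// shift_in; all other lanes keep their current value.
constexpr std::uint32_t clock_select(std::uint32_t keep,
                                     std::uint32_t shift_in,
                                     std::uint32_t do_clock) {
    // Assumption: not_clock is exactly the complement of do_clock.
    return (keep & ~do_clock) | (shift_in & do_clock);
}

// When both inputs are the same word (the r3_0 case), the mask drops out
// and the statement leaves the word unchanged.
static_assert(clock_select(0xDEADBEEFu, 0xDEADBEEFu, 0x0F0F0F0Fu) == 0xDEADBEEFu,
              "identical inputs: result equals the input");
static_assert(clock_select(0xDEADBEEFu, 0xDEADBEEFu, 0x00000000u) == 0xDEADBEEFu,
              "identical inputs, no lane clocked");
static_assert(clock_select(0xDEADBEEFu, 0xDEADBEEFu, 0xFFFFFFFFu) == 0xDEADBEEFu,
              "identical inputs, every lane clocked");

// With distinct inputs the mask does matter: the low 16 lanes are clocked.
static_assert(clock_select(0xAAAAAAAAu, 0x55555555u, 0x0000FFFFu) == 0xAAAA5555u,
              "distinct inputs: clocked lanes take shift_in");

// Runtime variant of the same no-op check for arbitrary words.
inline void check_noop(std::uint32_t r3_0, std::uint32_t do_clock_r3) {
    assert(clock_select(r3_0, r3_0, do_clock_r3) == r3_0);
}
```

The sketch only shows that the statement cannot change r3_0. Whether a particular compiler removes it when the two masks are held in separate variables is not something it demonstrates.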