jiangxin369 commented on PR #246: URL: https://github.com/apache/flink-ml/pull/246#issuecomment-1624846507
@lindong28 Thanks for the reply.

> For those algorithms which have the corresponding benchmark defined in ./flink-ml-benchmark/src/main/resources, can you run the benchmarks and document the performance improvements in the PR description?

For the algorithms that already have benchmarks defined in `./flink-ml-benchmark/src/main/resources`, I ran the benchmarks on my workstation, but most of them show performance similar to before. The reason is that the other algorithms are configured with a small dataset (1/10 the size of the Bucketizer dataset) so that they do not run too long. This PR optimizes performance by reducing the cost of creating rows, so the effect is positively correlated with dataset size.

> Can you confirm that Bucketizer benchmark results are obtained by running `./flink-ml-benchmark/src/main/resources/bucketizer-benchmark.json`?

Yes, they are.
