There is a large overhead to distributing this type of workload; for a small problem, the overhead dominates. A problem of this size does not need to be distributed at all, so more workers is probably just worse.
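The overhead point can be illustrated even without Spark or TensorFlow. A minimal sketch using Python's standard-library multiprocessing (a stand-in for the Spark workers, not the actual spark-tensorflow-distributor machinery): for a tiny task, process startup and inter-process communication typically cost more time than the computation itself saves.

```python
import multiprocessing as mp
import time

def square(x):
    # A deliberately trivial "training step" stand-in.
    return x * x

def run_serial(data):
    return [square(x) for x in data]

def run_parallel(data, workers):
    # Each call pays for spawning worker processes and shipping
    # data back and forth -- the per-task overhead.
    with mp.Pool(workers) as pool:
        return pool.map(square, data)

if __name__ == "__main__":
    data = list(range(1000))

    t0 = time.perf_counter()
    serial = run_serial(data)
    t_serial = time.perf_counter() - t0

    t0 = time.perf_counter()
    parallel = run_parallel(data, 4)
    t_parallel = time.perf_counter() - t0

    assert serial == parallel
    # For a workload this small, the parallel run is usually slower
    # than the serial one, mirroring the num_slots results below.
    print(f"serial: {t_serial:.4f}s, parallel (4 workers): {t_parallel:.4f}s")
```

The same economics apply to num_slots in spark-tensorflow-distributor: until the per-step compute is large enough to amortize scheduling and gradient-synchronization costs, adding slots only adds overhead.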
On Sun, Apr 30, 2023 at 1:46 AM second_co...@yahoo.com <second_co...@yahoo.com> wrote:

> I re-tested with the cifar10 example and below are the results. Can you advise why fewer num_slots is faster than more slots?
>
> num_slots=20: 231 seconds
> num_slots=5: 52 seconds
> num_slots=1: 34 seconds
>
> The code is at
> https://gist.github.com/cometta/240bbc549155e22f80f6ba670c9a2e32
>
> Do you have an example of TensorFlow with a big dataset that I can test?
>
> On Saturday, April 29, 2023 at 08:44:04 PM GMT+8, Sean Owen <sro...@gmail.com> wrote:
>
>> You don't want to use CPUs with TensorFlow. If it's not scaling, you may have a problem that is far too small to distribute.
>>
>> On Sat, Apr 29, 2023 at 7:30 AM second_co...@yahoo.com.INVALID <second_co...@yahoo.com.invalid> wrote:
>>
>>> Has anyone successfully run native TensorFlow on Spark? I tested the example at
>>> https://github.com/tensorflow/ecosystem/tree/master/spark/spark-tensorflow-distributor
>>> on Kubernetes CPUs, running on multiple workers' CPUs. I do not see any speedup in training time when setting the number of slots from 1 to 10; the time taken to train stays the same. Has anyone tested TensorFlow training on Spark distributed workers with CPUs? Can you share a working example?