(The suggestion here is to use TensorFlow with Spark, which has been doable for a long time with tools like Horovod. Spark handles the image processing just fine.)
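On the question of how much faster a 10-node cluster would be: there is no fixed number, because it depends on how much of the job actually parallelizes. A rough way to reason about it is Amdahl's law. The sketch below is only an illustration under that assumption; the function name and the parallel fractions are hypothetical, not measurements from any real pipeline.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Upper bound on speedup when only part of a job can run in parallel.

    parallel_fraction: share of single-machine runtime that parallelizes (0..1).
    workers: number of nodes sharing the parallel part.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


# Hypothetical parallel fractions; profile your own workload to get real ones.
for fraction in (0.5, 0.9, 0.99):
    speedup = amdahl_speedup(fraction, workers=10)
    print(f"{fraction:.0%} parallel on 10 nodes: at most {speedup:.2f}x")
```

This ignores data transfer and scheduling overhead, which matter with 1 to 5 GB records, so treat the result as an upper bound rather than a prediction.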
On Thu, Oct 14, 2021 at 10:17 AM Artemis User <arte...@dtechspace.com> wrote:

> Spark is good with SQL-type structured data, not image data, unless
> your algorithms don't require dealing with image data directly. I guess
> your best option would be to go with TensorFlow, since it has image
> classification models built in and can integrate with NVIDIA GPUs out of
> the box. There are no out-of-the-box data source APIs for image data in
> Spark. Hope this helps.
>
> -- ND
>
> On 10/13/21 11:54 PM, 刘沛文 wrote:
>
> Hi,
> My name is Peiwen. I'm working with Dr. Brain, an AI company focused on
> medical image processing and deep learning. Our website is
> http://drbrain.net/index_en.aspx
> We basically do 2 major things: 1. image processing, such as lesion
> drawing; 2. deep learning for neural disease prediction, such as stroke
> and Alzheimer's Disease.
> Currently we use TensorFlow and other deep learning frameworks. Due to the
> size of the medical images (1 ~ 5 GB per record), with a traditional
> framework on a single computer it takes a long time (a few hours) for
> data processing and model training before we get the result.
> I'm writing this email to check whether Apache Spark can provide a good
> solution to accelerate the computation.
> I know TensorFlow can work with Spark. I just want a brief understanding
> of how much faster Apache Spark could make this compared to traditional
> TensorFlow, say with a cluster of 10 nodes.
>
> Thank you very much!
>
> Peiwen