Hi, We would like to use Spark without Hadoop. In a highly scalable, high-availability deployment, YARN and the HDFS API serve the purposes of resource scheduling and shared storage. Our data is stored on separate disks (not shared). We have a couple of questions about this:
1. Can we replace YARN with an Akka cluster for resource scheduling (distributing work between the master and worker nodes)?
2. Is a shared file system necessary for Spark Streaming? Can the master and each worker use their own standalone disks for Spark Streaming and resource scheduling, without sharing any disk between Spark nodes?
3. What algorithm does the master node use to distribute traffic to the worker nodes, and how does Spark Streaming scale? Can an Akka cluster help with this in any way?

Regards,
Neeraj
