Answers inline below.

On Sunday, April 19, 2015 at 4:04:15 AM UTC-5, tomerneeraj wrote:
>
> Hi,
>
> We would like to use spark without Hadoop. To use it in highly scalable
> and high availability mode, yarn and hdfs Api do the purpose of resource
> scheduling and shared storage. We have data stored in separate disk(not
> shared). Couple of queries regarding this
>
> 1. Can we replace YARN with Akka cluster for resource scheduling(master
> and worker node work distribution )??
>
Akka Cluster has neither the resource management capabilities nor the integration with Spark that this would require. We at Typesafe are considering implementing this capability. For now, your best alternatives to YARN are Mesos, for which we are offering production support, and standalone mode, where you manually configure a cluster yourself. Mesos is best for general-purpose, multi-job and multi-use clustering, while standalone is fine if you have just a few jobs running, like a continuous streaming job with its own dedicated hardware.

> 2. Is it necessary to have shared file system for spark streaming. Can we
> have standalone disk for master and worker in spark streaming and resource
> scheduling without sharing any disk between spark nodes??
>
It's necessary to have a shared filesystem. It could be NFS, but you'll have poor I/O performance. Fortunately, running HDFS without the rest of Hadoop is not difficult. It might be possible to use other distributed filesystems like Ceph, but I haven't tried that.

> 3. What is the algorithm to distribute traffic by master node to worker
> node and how does spark streaming scale. Is there any way AKKA cluster
> helping it somehow??
>
Spark does a good job partitioning data, even incoming streams, across the cluster. When reading from a distributed filesystem it knows about (e.g., HDFS and S3), it can read and process blocks in parallel. Akka messaging is used for some internal communications, but Spark isn't "deeply" dependent on Akka.

Akka would be an excellent foundation for a big data system. At Typesafe, we're thinking about how to make use of it for different use cases ;)

>
> Regards
> Neeraj
>
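To make the streaming part of answer 3 a bit more concrete: for receiver-based input, Spark Streaming cuts the data arriving at each receiver into blocks, one per spark.streaming.blockInterval (200 ms by default), and each block becomes one partition, and so roughly one task, of the batch. Below is a rough sketch of that arithmetic in plain Python, not Spark code. The function name and its parameters are made up for illustration, and the result is only an approximation of what the scheduler actually does.

```python
def approx_tasks_per_batch(batch_interval_ms, block_interval_ms=200, receivers=1):
    """Estimate the number of map-side tasks per batch for receiver-based input.

    Hypothetical helper for illustration only. It applies the rule of thumb
    tasks ~= receivers * (batch interval / block interval).
    """
    if batch_interval_ms <= 0 or block_interval_ms <= 0 or receivers <= 0:
        raise ValueError("intervals and receiver count must be positive")
    # Each receiver emits one block per block interval; each block is one task.
    blocks_per_receiver = batch_interval_ms // block_interval_ms
    return receivers * blocks_per_receiver


if __name__ == "__main__":
    # A 2 second batch with the default 200 ms block interval, one receiver.
    print(approx_tasks_per_batch(2000))
    # The same batch interval with three receivers running in parallel.
    print(approx_tasks_per_batch(2000, receivers=3))
```

If the estimate comes out lower than the number of cores you have available, the usual levers are a shorter block interval, more receivers, or an explicit repartition of the stream.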
