Hi Terry,

From my limited experience, I'd say you have enough to get started. I've set up a small cloud with just 6 nodes on AWS: one namenode/jobtracker/Cloudbase (Accumulo when it was first released) machine, one zookeeper, and 4 datanode/tasktracker/tabletserver nodes. (Yes, I believe you should be able to run the Accumulo Master on the Hadoop namenode.)

The cloud was set up to test out running things on AWS, so I didn't do anything terribly data intensive on it. The worst issue I had was that MapReduce jobs needed more than a gig of memory, so early on I had to switch from medium size machines (with 4 gigs of ram) to large instances (8 gigs of ram).

Thoughts: You should have enough to get started. If you don't know where your limits are, you'll find them and then you can work to address them.

Recommendations: If and when you're ready to optimize your project, consider how your data is stored in Accumulo. NoSQL is new enough that I don't think the community has all the answers for particular use cases.

Cheers!
James

On Tue, Apr 16, 2013 at 8:07 PM, Terry P. <[email protected]> wrote:
> Greetings everyone,
> I'm learning a lot from reading all of the great questions and informative
> answers here on the Accumulo mailing list. Thus far I haven't come across
> a question similar to mine, nor a basic recommendation so here goes:
>
> I'm looking for recommendations on process / component placement for a
> small Accumulo cluster serving a prototype. It will be scaled later, but
> for now I'm looking at a cluster with just 8 nodes. My current thought
> process has led me to the following server / process placement and I'm
> interested in feedback on it.
>
> zoo1, zoo2, zoo3: ZooKeeper servers, dual proc, 4 GB RAM (small servers)
>
> namenode, secnamenode: 16GB RAM, 4 cores each, with local and remote
> locations to store name data
> *** Can I place the Accumulo Master on the NameNode or Secondary NameNode? ***
>
> accdata1, accdata2, accdata3: 16GB RAM, 4 cores each, serving as HDFS
> DataNodes and Accumulo TabletServers each with 4 2TB JBOD disks for HDFS
>
> I'm thinking having the Accumulo Master on the NameNode will simplify
> cluster startup. Thoughts? Recommendations?
>
> Many thanks in advance,
> Terry
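
For anyone sizing a similar layout, the memory concern raised in the reply can be checked with simple arithmetic before provisioning. The sketch below is not from the thread. The host names, RAM figures, and process placement follow Terry's message (with the Accumulo Master placed on the NameNode, as he proposes), but the per-process heap sizes and the OS reserve are placeholder assumptions chosen only for illustration, and `headroom_gb` is a hypothetical helper.

```python
# Process placement and RAM (GB) per host, taken from Terry's message.
NODES = {
    "zoo1": {"ram_gb": 4, "procs": ["zookeeper"]},
    "zoo2": {"ram_gb": 4, "procs": ["zookeeper"]},
    "zoo3": {"ram_gb": 4, "procs": ["zookeeper"]},
    "namenode": {"ram_gb": 16, "procs": ["namenode", "accumulo_master"]},
    "secnamenode": {"ram_gb": 16, "procs": ["secondary_namenode"]},
    "accdata1": {"ram_gb": 16, "procs": ["datanode", "tabletserver"]},
    "accdata2": {"ram_gb": 16, "procs": ["datanode", "tabletserver"]},
    "accdata3": {"ram_gb": 16, "procs": ["datanode", "tabletserver"]},
}

# ASSUMPTION: illustrative heap sizes in GB, not recommendations from the thread.
ASSUMED_HEAP_GB = {
    "zookeeper": 1,
    "namenode": 4,
    "secondary_namenode": 4,
    "accumulo_master": 1,
    "datanode": 1,
    "tabletserver": 4,
}

# ASSUMPTION: memory left aside for the operating system and page cache.
OS_RESERVE_GB = 2


def headroom_gb(node):
    """Return RAM left on a node after the OS reserve and all process heaps."""
    used = sum(ASSUMED_HEAP_GB[proc] for proc in node["procs"])
    return node["ram_gb"] - OS_RESERVE_GB - used


if __name__ == "__main__":
    for host, node in NODES.items():
        spare = headroom_gb(node)
        status = "ok" if spare >= 0 else "OVERCOMMITTED"
        print(f"{host}: {spare} GB spare ({status})")
```

Any spare memory on the accdata hosts is what would remain for MapReduce task JVMs if they were run there, which is where the reply reports hitting limits on 4 GB machines. Replacing the assumed heap values with measured ones makes the check meaningful for a real cluster.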
