On Mon, Mar/23/2009 03:53:53PM, Mike Dubman wrote:
> I'm playing with Google datastore now and will send a proposal and
> some thoughts.
>
> On Mon, Mar 23, 2009 at 2:33 PM, Jeff Squyres <jsquy...@cisco.com> wrote:
>
> Yes, I think you're right -- making a "schema" for the datastore might
> be quite easy. I'm on travel all this week and likely won't be able to
> look into this stuff -- can you guys post a proposal and we can dive
> into it from that angle?
>
> On Mar 22, 2009, at 6:48 AM, Mike Dubman wrote:
>
> Hello guys,
>
> I'm not sure we should preserve the current DB schema, for one simple
> reason: the datastore is an object-oriented store and has different
> rules and techniques than an RDBMS.
> The basic storage unit in the datastore is an object, which can be
> saved, loaded, and queried.
> (Hadoop is based on the same principles, but is open source.)
>
> It seems that the DB model for MTT over the datastore should not be
> complex at all. The current MTT DB schema is mostly optimized for the
> specific queries dictated by the web UI. The datastore creates indexes
> automatically, based on the history of submitted queries.
>
> I suggest we discuss and exchange DB layout proposals by email, and
> once we reach a general understanding of how it should look, we
> switch to telepresence.
>
> Also, it seems to be no problem at all to get datastore access for an
> existing Gmail account. You get a 500MB quota for storage. It takes
> 5 minutes to start using it.
>
> Here is some short info on the datastore API:
> - how to submit a data model to the datastore
> - how to save, load, and query
>
> http://code.google.com/appengine/docs/python/gettingstarted/usingdatastore.html
>
> Please comment.

Do we have a monthly cost estimate for this project? We will exceed the
free quota of CPU/bandwidth/storage/email and get billed (how much
depends on how efficient our app is):

  http://code.google.com/appengine/docs/billing.html

The biggest concern would be the Stored Data cost, because I like how we
can now archive lots and lots of test results. I do not have permission
to access /var/lib/pgsql/data, but weren't we at or near 100 GB
recently?

The bandwidth charge would seem to be pretty nominal. We could upload
100 GB/mo. for just $10/mo. (downloading the same amount would be $12
at the outgoing rate), and I am not sure if we approach 100 GB. CPU Time
is a mystery number to me.

  -------------------+---------------------+----------
  Resource           | Unit                | Unit cost
  -------------------+---------------------+----------
  Outgoing Bandwidth | gigabytes           | $0.12
  Incoming Bandwidth | gigabytes           | $0.10
  CPU Time           | CPU hours           | $0.10
  Stored Data        | gigabytes per month | $0.15
  Recipients Emailed | recipients          | $0.0001
  -------------------+---------------------+----------

Would we itemize the MTT bill on a per-user basis? E.g., would orgs that
use MTT more have to pay more?

-Ethan

>
> Thanks
>
> Mike
>
> On Fri, Mar 20, 2009 at 5:38 PM, Jeff Squyres <jsquy...@cisco.com>
> wrote:
>
> On Mar 20, 2009, at 10:42 AM, Josh Hursey wrote:
>
> Yeah I think this sounds like a good way to move forward with this
> work. The database schema is pretty complex. If you need help on the
> database side of things let me know.
>
> To get started, would it be useful to have a meeting over the phone/
> telepresence to design the datastore layout? This gives us an
> opportunity to start from a blank slate with regards to the
> datastore, so it may be useful to brainstorm a bit beforehand.
>
> Yes, it probably would. My understanding of Hadoop (which is very
> high-level) is that you just dump everything in without too much
> concern about the structure / "schema". But I could be wrong on that.
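
Going back to the billing table quoted above: a rough way to turn those
unit costs into a monthly figure is sketched below. This is only a
back-of-the-envelope helper, not anything from MTT or App Engine. The
names (UNIT_COST, monthly_cost, EXAMPLE_USAGE) are made up for
illustration, the usage numbers are placeholders rather than
measurements, and the free quota is ignored, so the result overestimates
the real bill.

```python
# Unit costs in USD, copied from the billing table quoted above.
UNIT_COST = {
    "outgoing_bandwidth_gb": 0.12,
    "incoming_bandwidth_gb": 0.10,
    "cpu_hours": 0.10,
    "stored_data_gb_month": 0.15,
    "recipients_emailed": 0.0001,
}


def monthly_cost(usage):
    """Return (total, breakdown) in USD for one month of usage.

    `usage` maps resource names from UNIT_COST to quantities; resources
    that are not listed count as zero. Free quotas are ignored, so this
    overestimates what would actually be billed.
    """
    unknown = set(usage) - set(UNIT_COST)
    if unknown:
        raise KeyError("unknown resources: %s" % sorted(unknown))
    breakdown = {name: usage.get(name, 0) * price
                 for name, price in UNIT_COST.items()}
    return sum(breakdown.values()), breakdown


# Placeholder usage, NOT measured: 100 GB stored, 100 GB in, 100 GB out.
# CPU hours are left out because nobody has a number for them yet.
EXAMPLE_USAGE = {
    "stored_data_gb_month": 100,
    "incoming_bandwidth_gb": 100,
    "outgoing_bandwidth_gb": 100,
}

if __name__ == "__main__":
    total, breakdown = monthly_cost(EXAMPLE_USAGE)
    for name in sorted(breakdown):
        print("%-24s $%7.2f" % (name, breakdown[name]))
    print("%-24s $%7.2f" % ("total", total))
```

With these placeholder numbers, storage is the largest single item ($15
out of roughly $37 per month), which is in line with the concern about
Stored Data above. The CPU Time line stays at zero until someone has a
real estimate for it.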
>
> The Google Apps account is under my personal Google account, so I am
> reluctant to use it. I think the reason it took so long for me was
> that it was in limited beta when I originally signed up. I think
> the approval time is much shorter now (maybe a day?), and we can make
> an openmpi or mtt account that we can use.
>
> With regard to Hadoop, I don't think that IU has a set of machines
> that would work, but I can ask around. We could always try Hadoop on
> a single machine if people wanted to play around with data querying/
> storage.
>
> I don't have a strong preference either way, but Google Apps may
> provide us with a lower overhead solution for the long run even
> though it costs $$.
>
> It looks like there is a set that you can use for free. When you go
> over one of several metrics (CPU hours/day, storage, bandwidth in,
> bandwidth out, etc.), then you have to start paying. But even with
> that, the costs look *quite* reasonable and should be easily covered
> by the combined Open MPI organizations (I'm talking hundreds of
> dollars here, not tens of thousands).
>
> --
> Jeff Squyres
> Cisco Systems
>
> _______________________________________________
> mtt-devel mailing list
> mtt-de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/mtt-devel