Also http://aws.amazon.com/publicdatasets/.
On Fri, Mar 12, 2010 at 11:59 PM, Ian Holsman <i...@holsman.net> wrote:
> There are several large data sets on the net you could use to build a
> demo with:
> search logs, Wikipedia, UK govt stuff.
> DBpedia may be interesting, as they have some of the stuff extracted out.
>
> ---
> Sent from my phone
> Ian Holsman - 703 879-3128
>
> On 13/03/2010, at 4:46 PM, Jonathan Ellis <jbel...@gmail.com> wrote:
>
>> On Fri, Mar 12, 2010 at 1:55 PM, Krishna Sankar <ksanka...@gmail.com>
>> wrote:
>>>
>>> I was looking at this from CASSANDRA-873 as well as hands-on homework (!)
>>> for my OSCON tutorial. Have a couple of questions. Would appreciate
>>> insights:
>>>
>>> A) CASSANDRA-873 suggests Lucandra as one demo application
>>> B) Are there other ideas that will bring out the various aspects of
>>> Cassandra?
>>
>> multi-user blog (single-user is too easy :)
>> - extra credit: with full-text search using lucandra
>>
>> discussion forum
>> - also w/ FTS
>>
>>> C) What would be the goal of demo apps? Tutorial to help folks learn
>>> the ins and outs of Cassandra? Showcase capabilities? I think
>>> CASSANDRA-873 belongs to the latter; Twissandra most probably belongs
>>> to the former.
>>
>> I think you nailed it.
>>
>>> D) Hadoop on Cassandra might be a good demo/tutorial
>>
>> Sure, I'll buy that.
>>
>> I can't think of any standalone projects for that, but "compute a
>> twissandra tag cloud" would be pretty cool. (Might need to write a
>> twissandra bot to load stuff in to make an interesting cloud. :)
>>
>>> E) How would one structure the infrastructure for the demo/tutorials?
>>> What assumptions can we make in creating them? As AMIs to be run in EC2?
>>
>> I'd probably go with "virtualbox images" as being simpler for people
>> who don't have an AWS key already. (VB can read vmware player images,
>> I think. But there is no free vmware for OS X, so you'd want to check
>> that before going w/ vmware format.)
>>
>> Or just have people d/l cassandra and a configuration xml. Probably
>> easier than teaching people to use virtualbox who haven't before.
>>
>>> Also to be run on 2-3 local machines for folks who can spare some? Or
>>> as multiple processes, all in one machine?
>>
>> You're not going to have time to teach cluster management. Keep it to 1.