Thanks, Jacques. I'm very happy to get involved and share my experiences. I'm now looking for the best way to set up a cluster. For evaluating Drill's performance, do you think it's especially important to have a system that is close in performance to a production cluster, or would it be worthwhile to explore it on a small scale? The problem is that, as a student, my budget is limited, so I'm looking at things like Raspberry Pi clusters, which I think don't show linear performance improvements as you scale out. I'm also enquiring about EC2 or GCE student licensing.

On 29 August 2013 05:08, Jacques Nadeau <[email protected]> wrote:

> A Hadoop cluster would be a good start. We're in the process right now of
> putting together distributable files which will help get you to up to speed
> quickly. Contribution isn't just code, there are many types and I'm sure
> you can help in any number of ways. Just documenting your early
> experiences and advice would be a great way to start helping out.
>
> Jacques
>
>
> On Sun, Aug 25, 2013 at 1:25 PM, Tom Seddon <[email protected]> wrote:
>
> > Hi,
> >
> > I'm looking to do a dissertation on Drill, as part of masters degree in
> > Data Science. I'm hoping to set up a cluster to run it and then analyse
> > its efficiency with different datasets, as well as make recommendations
> > for its usage. I know Drill is in a fairly early stage of development
> > but I have around 18 months until the project is due, so I'm hoping the
> > timing will work as Drill is developed further.
> >
> > I'd be grateful for any advice on how I could get started on this.
> > Would a Hadoop cluster be a good back-end to base my project on or would
> > something more suited to nested data like MongoDB be more appropriate?
> > Also, I haven't found much documentation on configuring Drill in a
> > distributed environment, so any help on this would be appreciated.
> >
> > I'd also be willing to contribute but not sure if I have enough Java
> > experience. My background is mainly in BI and database technologies.
> >
> > Thanks,
> >
> > Tom
