Ted,

It is actually very easy and painless to do what I am proposing. I probably made it sound far more bureaucratic/legalistic than it really is.
Researchers and projects from across the globe can apply for cycles on Beacon or any other HPC platform we run. Beacon is by far the best, and we already have a setup to run Spark and Hive on it; we just published a paper about it at XSEDE on integrating the PBS/TORQUE scheduler with Spark to run JVM-bound jobs.

As for use of resources: at the end of the year we need to submit reports on all the projects that used compute resources, and how. As one of the XSEDE centers, it is part of our mission to help promote the advancement of science and technology, and reports from Principal Investigators (PIs) show how we did it. In this case, I can be the PI and have someone from the Drill team assigned access.

I don't think there are any IP issues: an open source project, an open research institution, and use of resources for testing and benchmarking. We could actually make JICS a benchmarking site for Drill (and even other Apache projects).

We'll discuss other details in a hangout. I am also planning to brief my team next Wednesday on the plan for the use of resources.

Regards,
Edmon

On Saturday, September 5, 2015, Ted Dunning <[email protected]> wrote:

> Edmon,
>
> This is very interesting. I am sure that public acknowledgements of
> contributions are easily managed.
>
> What might be even more useful for you would be small-scale publications,
> especially about the problems of shoe-horning real-world data objects into
> the quasi-relational model of Drill.
>
> What would be problematic (and what is probably just a matter of
> nomenclature) is naming of an institution by the Apache-specific term
> "committer" (you said commitment). Individuals at your institution would
> absolutely be up for being committers as they demonstrate a track record of
> contribution.
>
> I would expect no need for any paperwork between JICS and Apache unless you
> would like to execute a corporate contributor license to ensure that
> particular individuals are specifically empowered to contribute code.
> I don't know what the position of JICS is relative to intellectual
> property, though, so it might be worth checking out institutional policy on
> your side on how individuals can contribute to open source projects. It
> shouldn't be too hard, since there are quite a number of NSF-funded people
> who do contribute.
>
> On Fri, Sep 4, 2015 at 9:39 PM, Edmon Begoli <[email protected]> wrote:
>
> > I can work with my institution and the NSF to commit time on the Beacon
> > supercomputing cluster to Apache and the Drill project. Maybe 20 hours a
> > month for 4-5 nodes.
> >
> > I have discretionary hours that I can put in, and I can, with our HPC
> > admins, create deploy scripts on a few clustered machines. These are all
> > very large boxes with 16 cores, 256 GB of memory, a 40 Gb IB
> > interconnect, and a local 1 TB SSD each. There is also the 10 PB Medusa
> > filesystem attached, but HDFS over the local drives would probably be
> > better. They are otherwise just regular machines, and run regular JVMs
> > on Linux.
> >
> > We can also get Rahul access with a secure token to set up and run
> > stress/performance/integration tests for Drill. I can actually help
> > there as well. This can be automated to run tests and collect results.
> >
> > I think that the only requirement would be that the JICS team be named
> > for commitment, because both NSF/XSEDE and UT like to see the resources
> > being officially used and acknowledged. They are there to support open
> > and academic research; open source projects fit well.
> >
> > If this sounds OK with the project PMCs, I can start the process of
> > allocation, account creation, and setup.
> >
> > I would also, as the CDO of JICS, sign whatever standard papers with
> > the Apache organization.
> >
> > With all this being said, please let me know if this is something we
> > want to pursue.
> >
> > Thank you,
> > Edmon
> >
> > On Tuesday, September 1, 2015, Jacques Nadeau <[email protected]> wrote:
> >
> > > I spent a bunch of time looking at the Phi coprocessors and forgot to
> > > get back to the thread. I'd love it if someone spent some time looking
> > > at leveraging them (since Drill is frequently processor-bound). Any
> > > takers?
> > >
> > > --
> > > Jacques Nadeau
> > > CTO and Co-Founder, Dremio
> > >
> > > On Mon, Aug 31, 2015 at 10:24 PM, Parth Chandra <[email protected]> wrote:
> > >
> > > > Hi Edmon,
> > > >
> > > > Sorry no one seems to have got back to you on this. We are in the
> > > > process of publishing a test suite for regression testing Drill, and
> > > > the cluster you have (even a few nodes) would be a great resource
> > > > for folks to run the test suite. Rahul et al. are working on this,
> > > > and I would suggest watching out for Rahul's posts on the topic.
> > > >
> > > > Parth
> > > >
> > > > On Tue, Aug 25, 2015 at 9:55 PM, Edmon Begoli <[email protected]> wrote:
> > > >
> > > > > Hey folks,
> > > > >
> > > > > As we discussed today on a hangout, this is a machine that we have
> > > > > at JICS/NICS where I have Drill installed and where I could set up
> > > > > a test cluster over a few nodes.
> > > > >
> > > > > https://www.nics.tennessee.edu/computing-resources/beacon/configuration
> > > > >
> > > > > Note that each node has:
> > > > > - 2x 8-core Intel® Xeon® E5-2670 processors
> > > > > - 256 GB of memory
> > > > > - 4 Intel® Xeon Phi™ 5110P coprocessors with 8 GB of memory each
> > > > > - 960 GB of SSD storage
> > > > >
> > > > > Would someone advise on what would be an interesting test setup?
> > > > >
> > > > > Thank you,
> > > > > Edmon
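
P.S. On the automation idea above (deploy scripts plus automated test runs): a minimal sketch of the kind of helper this could start from. The `beaconNNN` hostname pattern, the ZooKeeper address, and the cluster name are placeholders I made up for illustration; `cluster-id` and `zk.connect` are the standard `drill-override.conf` settings for forming a Drillbit cluster, but the actual Beacon values would differ.

```shell
#!/bin/sh
# Sketch only: generate a node list and a drill-override.conf fragment
# for a small Drill test cluster. Hostnames and the ZooKeeper address
# are hypothetical, not actual Beacon values.

# Emit node names beacon001..beaconNNN for the requested count.
node_list() {
  count=$1
  i=1
  while [ "$i" -le "$count" ]; do
    printf 'beacon%03d\n' "$i"
    i=$((i + 1))
  done
}

# Emit a minimal drill-override.conf pointing every Drillbit at one
# (assumed) ZooKeeper quorum so the nodes join the same cluster.
drill_override() {
  cluster_id=$1
  zk=$2
  cat <<EOF
drill.exec: {
  cluster-id: "$cluster_id",
  zk.connect: "$zk"
}
EOF
}

node_list 5
drill_override "beacon-drill-test" "beacon001:2181"
```

The same node list could then drive an ssh loop that pushes the config, starts the Drillbits, runs the test suite, and collects results into one directory per run.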
