This request has been approved. I will get more details tomorrow.

I could add a few members of the Drill team to the resource, perhaps one
person from MapR and one from Dremio, who would have access and could
assist in configuring (or instructing the resource sysadmins on) how to
run the big tests, if desired.
They would need to apply for and obtain RSA tokens.

Then we can talk about how to make this resource part of the regular
testing and benchmarking process.
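As a first automated use of the allocation, the clean nightly build idea
mentioned below could be scripted. This is only a dry-run sketch; the repo
URL, paths, and Maven options are generic assumptions, not JICS specifics:

```shell
#!/bin/sh
# Hypothetical sketch of a "clean slate" nightly Drill build: a fresh
# checkout and an empty local Maven repo, so nothing is cached between
# runs. All paths and options here are assumptions, not JICS specifics.
set -eu

WORKDIR=$(mktemp -d)          # scratch dir, discarded afterwards
M2_REPO="$WORKDIR/m2-repo"    # empty repo forces full dependency downloads

# Dry-run helper: print each step instead of executing it.
run() { echo "+ $*"; }

run git clone https://github.com/apache/drill "$WORKDIR/drill"
run mvn -f "$WORKDIR/drill/pom.xml" -Dmaven.repo.local="$M2_REPO" \
    clean install

rm -rf "$WORKDIR"             # leave the node clean for the next job
```

A cron entry on a head node could then run this nightly and publish the
build log wherever the project wants status checked.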

Thank you,
Edmon

On Fri, Sep 18, 2015 at 8:00 PM, Edmon Begoli <[email protected]> wrote:

> I requested 5000 hours a year on Beacon for Apache Drill for high
> performance benchmarking, testing and optimization.
> I will let you know of the resolution pretty soon. I expect these
> resources to be awarded to the project.
>
>
> On Fri, Sep 18, 2015 at 6:22 PM, Parth Chandra <[email protected]>
> wrote:
>
>> +1 on running the build and tests.
>> If we need to run some kind of stress tests, we could consider running
>> TPC-H/TPC-DS at large scale factors.
>>
>> On Fri, Sep 18, 2015 at 2:24 PM, Jacques Nadeau <[email protected]>
>> wrote:
>>
>> > Not offhand. It really depends on how the time would work. For
>> > example, it would be nice if we had an automated, perfectly fresh
>> > (no .m2/repo) nightly build and full test suite run so people can
>> > always check the status. Maybe we use this hardware for that?
>> >
>> > --
>> > Jacques Nadeau
>> > CTO and Co-Founder, Dremio
>> >
>> > On Fri, Sep 18, 2015 at 9:48 AM, rahul challapalli
>> > <[email protected]> wrote:
>> >
>> > > Edmon,
>> > >
>> > > We do have the tests available now [1].
>> > >
>> > > Jacques,
>> > >
>> > > You expressed interest in making these tests available on an Amazon
>> > > cluster so that users need not have the physical hardware required
>> > > to run these tests. Do you have any specific thoughts on how to
>> > > leverage the resources that Edmon is willing to contribute
>> > > (performance testing?)
>> > >
>> > >
>> > > [1] https://github.com/mapr/drill-test-framework
>> > >
>> > > - Rahul
>> > >
>> > > On Thu, Sep 17, 2015 at 8:49 PM, Edmon Begoli <[email protected]>
>> > > wrote:
>> > >
>> > > > I discussed this idea of bringing a large compute resource to the
>> > > > project yesterday with my team at JICS, and there was a general
>> > > > consensus that it can be committed.
>> > > >
>> > > > I will request and hopefully commit a pretty large set of
>> > > > clustered CPU/storage resources for the needs of the Drill
>> > > > project.
>> > > >
>> > > > I will be the PI for the resource, and could give access to
>> > > > whomever we want to designate from the Drill project side.
>> > > >
>> > > > Just let me know. I should have the project approved within a few
>> > > > days.
>> > > > Edmon
>> > > >
>> > > >
>> > > > On Saturday, September 5, 2015, Edmon Begoli <[email protected]>
>> > > > wrote:
>> > > >
>> > > > > Ted,
>> > > > >
>> > > > > It is actually very easy and painless to do what I am
>> > > > > proposing. I probably made it sound far more
>> > > > > bureaucratic/legalistic than it really is.
>> > > > >
>> > > > > Researchers and projects from across the globe can apply for
>> > > > > cycles on Beacon or any other HPC platform we run. (Beacon is by
>> > > > > far the best, and we already have a setup to run Spark and Hive
>> > > > > on it. We just published a paper at XSEDE on integrating the
>> > > > > PBS/TORQUE scheduler with Spark to run JVM-bound jobs.)
>> > > > >
>> > > > > As for the use of resources, at the end of the year we need to
>> > > > > submit reports on all the projects that used compute resources
>> > > > > and how. It is part of our mission, as one of the XSEDE centers,
>> > > > > to help promote the advancement of science and technology.
>> > > > > Reports from Principal Investigators (PIs) show how we did it.
>> > > > > In this case, I can be the PI and have someone from the Drill
>> > > > > team assigned access.
>> > > > >
>> > > > > I don't think there are any IP issues: an open source project,
>> > > > > an open research institution, and use of resources for testing
>> > > > > and benchmarking. We could actually make JICS a benchmarking
>> > > > > site for Drill (and even other Apache projects).
>> > > > >
>> > > > > We'll discuss other details in a hangout. I am also planning to
>> > > > > brief my team next Wednesday on the plan for the use of
>> > > > > resources.
>> > > > >
>> > > > > Regards,
>> > > > > Edmon
>> > > > >
>> > > > >
>> > > > > On Saturday, September 5, 2015, Ted Dunning
>> > > > > <[email protected]> wrote:
>> > > > >
>> > > > >> Edmon,
>> > > > >>
>> > > > >> This is very interesting. I am sure that public
>> > > > >> acknowledgements of contributions are easily managed.
>> > > > >>
>> > > > >> What might be even more useful for you would be small-scale
>> > > > >> publications, especially about the problems of shoe-horning
>> > > > >> real-world data objects into the quasi-relational model of
>> > > > >> Drill.
>> > > > >>
>> > > > >> What would be problematic (and is probably just a matter of
>> > > > >> nomenclature) is naming an institution with the Apache-specific
>> > > > >> term "committer" (you said commitment). Individuals at your
>> > > > >> institution would absolutely be up for being committers as they
>> > > > >> demonstrate a track record of contribution.
>> > > > >>
>> > > > >> I would expect no need for any paperwork between JICS and
>> > > > >> Apache unless you would like to execute a corporate contributor
>> > > > >> license to ensure that particular individuals are specifically
>> > > > >> empowered to contribute code. I don't know what the position of
>> > > > >> JICS is relative to intellectual property, though, so it might
>> > > > >> be worth checking your institutional policy on how individuals
>> > > > >> can contribute to open source projects. It shouldn't be too
>> > > > >> hard, since there are quite a number of NSF-funded people who
>> > > > >> do contribute.
>> > > > >>
>> > > > >> On Fri, Sep 4, 2015 at 9:39 PM, Edmon Begoli
>> > > > >> <[email protected]> wrote:
>> > > > >>
>> > > > >> > I can work with my institution and the NSF so that we commit
>> > > > >> > time on the Beacon supercomputing cluster to Apache and the
>> > > > >> > Drill project. Maybe 20 hours a month for 4-5 nodes.
>> > > > >> >
>> > > > >> > I have discretionary hours that I can put in, and I can,
>> > > > >> > with our HPC admins, create deploy scripts on a few clustered
>> > > > >> > machines (these are all very large boxes with 16 cores,
>> > > > >> > 256 GB of memory, a 40 Gb InfiniBand interconnect, and a
>> > > > >> > local 1 TB SSD each). There is also a 10 PB Medusa filesystem
>> > > > >> > attached, but HDFS over local drives would probably be
>> > > > >> > better. They are otherwise just regular machines, and run
>> > > > >> > regular JVMs on Linux.
>> > > > >> >
>> > > > >> > We can also get Rahul access with a secure token to set up
>> > > > >> > and run stress/performance/integration tests for Drill. I can
>> > > > >> > actually help there as well. This can be automated to run
>> > > > >> > tests and collect results.
>> > > > >> >
>> > > > >> > I think the only requirement would be that the JICS team be
>> > > > >> > named for the commitment, because both NSF/XSEDE and UT like
>> > > > >> > to see the resources being officially used and acknowledged.
>> > > > >> > They are there to support open and academic research; open
>> > > > >> > source projects fit well.
>> > > > >> >
>> > > > >> > If this sounds OK to the project PMC, I can start the
>> > > > >> > process of allocation, account creation, and setup.
>> > > > >> >
>> > > > >> > I would also, as the CDO of JICS, sign whatever standard
>> > > > >> > paperwork is needed with the Apache organization.
>> > > > >> >
>> > > > >> > With all this being said, please let me know if this is
>> > > > >> > something we want to pursue.
>> > > > >> >
>> > > > >> > Thank you,
>> > > > >> > Edmon
>> > > > >> >
>> > > > >> > On Tuesday, September 1, 2015, Jacques Nadeau
>> > > > >> > <[email protected]> wrote:
>> > > > >> >
>> > > > >> > > I spent a bunch of time looking at the Phi coprocessors
>> > > > >> > > and forgot to get back to the thread. I'd love it if
>> > > > >> > > someone spent some time looking at leveraging them (since
>> > > > >> > > Drill is frequently processor-bound). Any takers?
>> > > > >> > >
>> > > > >> > >
>> > > > >> > >
>> > > > >> > > --
>> > > > >> > > Jacques Nadeau
>> > > > >> > > CTO and Co-Founder, Dremio
>> > > > >> > >
>> > > > >> > > On Mon, Aug 31, 2015 at 10:24 PM, Parth Chandra
>> > > > >> > > <[email protected]> wrote:
>> > > > >> > >
>> > > > >> > > > Hi Edmon,
>> > > > >> > > >   Sorry no one seems to have gotten back to you on this.
>> > > > >> > > >   We are in the process of publishing a test suite for
>> > > > >> > > > regression testing Drill, and the cluster you have (even
>> > > > >> > > > a few nodes) would be a great resource for folks to run
>> > > > >> > > > the test suite. Rahul et al. are working on this, and I
>> > > > >> > > > would suggest watching out for Rahul's posts on the
>> > > > >> > > > topic.
>> > > > >> > > >
>> > > > >> > > > Parth
>> > > > >> > > >
>> > > > >> > > > On Tue, Aug 25, 2015 at 9:55 PM, Edmon Begoli
>> > > > >> > > > <[email protected]> wrote:
>> > > > >> > > >
>> > > > >> > > > > Hey folks,
>> > > > >> > > > >
>> > > > >> > > > > As we discussed today on a hangout, this is a machine
>> > > > >> > > > > that we have at JICS/NICS where I have Drill installed
>> > > > >> > > > > and where I could set up a test cluster over a few
>> > > > >> > > > > nodes.
>> > > > >> > > > >
>> > > > >> > > > > https://www.nics.tennessee.edu/computing-resources/beacon/configuration
>> > > > >> > > > >
>> > > > >> > > > > Note that each node has:
>> > > > >> > > > > - 2 x 8-core Intel® Xeon® E5-2670 processors
>> > > > >> > > > > - 256 GB of memory
>> > > > >> > > > > - 4 Intel® Xeon Phi™ 5110P coprocessors with 8 GB of
>> > > > >> > > > >   memory each
>> > > > >> > > > > - 960 GB of SSD storage
>> > > > >> > > > >
>> > > > >> > > > > Would someone advise on what would be an interesting
>> > > > >> > > > > test setup?
>> > > > >> > > > >
>> > > > >> > > > > Thank you,
>> > > > >> > > > > Edmon
>> > > > >> > > > >
>> > > > >> > > >
>> > > > >> > >
>> > > > >> >
>> > > > >>
>> > > > >
>> > > >
>> > >
>> >
>>
>
>
