Steven - please send me your contact info (email for now, preferably your
Apache address if you have one, or your Dremio one) to ebegoliATutkDOTedu.

Thank you,
Edmon

On Fri, Sep 25, 2015 at 12:18 PM, Jacques Nadeau <[email protected]> wrote:

> That is great news! From the Dremio side, I propose working with Steven.
> Let's start taking advantage of this awesome resource!
>
> --
> Jacques Nadeau
> CTO and Co-Founder, Dremio
>
> On Wed, Sep 23, 2015 at 5:34 PM, Edmon Begoli <[email protected]> wrote:
>
> > This request has been approved. I will get more details tomorrow.
> >
> > I could add a few members of the Drill team to the resource, maybe one
> > person from MapR and one from Dremio, who can have access and can assist
> > in configuring (or instructing the resource sysadmins on) how to run the
> > big tests, if desired.
> > They will need to apply and get RSA tokens.
> >
> > Then we can talk about how to make this resource part of the regular
> > testing and benchmarking process.
> >
> > Thank you,
> > Edmon
> >
> > On Fri, Sep 18, 2015 at 8:00 PM, Edmon Begoli <[email protected]> wrote:
> >
> > > I requested 5000 hours a year on Beacon for Apache Drill for high
> > > performance benchmarking, testing and optimization.
> > > I will let you know of the resolution pretty soon. I expect these
> > > resources to be awarded to the project.
> > >
> > >
> > > On Fri, Sep 18, 2015 at 6:22 PM, Parth Chandra <[email protected]>
> > > wrote:
> > >
> > >> +1 on running the build and tests.
> > >> If we need to run some kind of stress tests, we could consider running
> > >> TPC-H/TPC-DS at large scale factors.
> > >>
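
A rough sketch of what a scripted run at a large scale factor could look
like, assuming a drillbit whose REST endpoint is on the default port 8047
and a hypothetical dfs.tpch workspace pointing at generated TPC-H data (the
query below is a simplified Q1, not the official text):

# Time a TPC-H-style query against Drill's REST API (sketch; the endpoint
# and workspace names are assumptions, not part of any existing test setup).
import time
import requests

DRILL_URL = "http://localhost:8047/query.json"

TPCH_Q1_SIMPLIFIED = """
SELECT l_returnflag, l_linestatus,
       SUM(l_quantity)      AS sum_qty,
       SUM(l_extendedprice) AS sum_base_price
FROM dfs.tpch.`lineitem`
WHERE l_shipdate <= DATE '1998-09-02'
GROUP BY l_returnflag, l_linestatus
ORDER BY l_returnflag, l_linestatus
"""

def run_query(sql):
    """Submit one SQL statement and return its wall-clock time in seconds."""
    start = time.time()
    resp = requests.post(DRILL_URL, json={"queryType": "SQL", "query": sql})
    resp.raise_for_status()
    return time.time() - start

if __name__ == "__main__":
    print("TPC-H Q1 (simplified): %.1fs" % run_query(TPCH_Q1_SIMPLIFIED))

Scaling this up would mostly be a matter of looping over the 22 TPC-H (or
99 TPC-DS) queries and recording the timings per scale factor.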
> > >> On Fri, Sep 18, 2015 at 2:24 PM, Jacques Nadeau <[email protected]>
> > >> wrote:
> > >>
> > >> > Not offhand. It really depends on how the time would work. For
> > >> > example, it would be nice if we had an automated, perfectly fresh
> > >> > (no .m2/repo) nightly build and full test suite run so people can
> > >> > always check the status. Maybe we use this hardware for that?
> > >> >
> > >> > --
> > >> > Jacques Nadeau
> > >> > CTO and Co-Founder, Dremio
> > >> >
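
For reference, a minimal sketch of what such a nightly job could look like:
a scheduled script that checks out a fresh tree and points Maven at an empty
throwaway repository via -Dmaven.repo.local, so nothing cached under ~/.m2
can mask a dependency problem (paths and the scheduling wiring are
assumptions):

# Nightly "fresh" build sketch: clone Drill into a scratch directory and
# build with an empty local Maven repo so every artifact is re-resolved.
import subprocess
import tempfile
from pathlib import Path

def nightly_build():
    scratch = Path(tempfile.mkdtemp(prefix="drill-nightly-"))
    src = scratch / "drill"
    m2 = scratch / "m2-repo"   # empty repo, so nothing from ~/.m2 is reused

    subprocess.run(
        ["git", "clone", "--depth", "1",
         "https://github.com/apache/drill.git", str(src)],
        check=True,
    )
    # Full build plus the unit test suite, using the throwaway repo.
    result = subprocess.run(
        ["mvn", "-Dmaven.repo.local=" + str(m2), "clean", "install"],
        cwd=str(src),
    )
    return result.returncode

if __name__ == "__main__":
    raise SystemExit(nightly_build())

Wired into cron (or the cluster's scheduler), with the exit status and test
reports published somewhere visible, this would give the "always check the
status" view described above.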
> > >> > On Fri, Sep 18, 2015 at 9:48 AM, rahul challapalli <
> > >> > [email protected]> wrote:
> > >> >
> > >> > > Edmon,
> > >> > >
> > >> > > We do have the tests available now [1].
> > >> > >
> > >> > > Jacques,
> > >> > >
> > >> > > You expressed interest in making these tests available on an Amazon
> > >> > > cluster so that users need not have the physical hardware required
> > >> > > to run these tests.
> > >> > > Do you have any specific thoughts on how to leverage the resources
> > >> > > that Edmon is willing to contribute (performance testing?)
> > >> > >
> > >> > >
> > >> > > [1] https://github.com/mapr/drill-test-framework
> > >> > >
> > >> > > - Rahul
> > >> > >
> > >> > > On Thu, Sep 17, 2015 at 8:49 PM, Edmon Begoli <[email protected]>
> > >> wrote:
> > >> > >
> > >> > > > I discussed this idea of bringing a large compute resource to the
> > >> > > > project yesterday with my team at JICS, and there was a general
> > >> > > > consensus that it can be committed.
> > >> > > >
> > >> > > > I will request and hopefully commit a pretty large set of
> > >> > > > clustered CPU/storage resources for the needs of the Drill
> > >> > > > project.
> > >> > > >
> > >> > > > I will be the PI for the resource, and could give access to
> > >> > > > whomever we want to designate from the Drill project side.
> > >> > > >
> > >> > > > Just let me know. I should have the project approved within a
> > >> > > > few days.
> > >> > > >
> > >> > > > Edmon
> > >> > > >
> > >> > > >
> > >> > > > On Saturday, September 5, 2015, Edmon Begoli <[email protected]> wrote:
> > >> > > >
> > >> > > > > Ted,
> > >> > > > >
> > >> > > > > It is actually very easy and painless to do what I am
> > >> > > > > proposing. I probably made it sound far more
> > >> > > > > bureaucratic/legalistic than it really is.
> > >> > > > >
> > >> > > > > Researchers and projects from across the globe can apply for
> > >> > > > > cycles on Beacon or any other HPC platform we run. Beacon is by
> > >> > > > > far the best, and we already have a setup to run Spark and Hive
> > >> > > > > on it. (We just published a paper at XSEDE on integrating the
> > >> > > > > PBS/TORQUE scheduler with Spark to run JVM-bound jobs.)
> > >> > > > >
> > >> > > > > As for the use of resources, at the end of the year we need to
> > >> > > > > submit reports for all the projects that used compute resources,
> > >> > > > > and how. It is part of our mission, as one of the XSEDE centers,
> > >> > > > > to help promote the advancement of science and technology.
> > >> > > > > Reports from Principal Investigators (PIs) show how we did it.
> > >> > > > > In this case, I can be the PI and have anyone from the Drill
> > >> > > > > team assigned access.
> > >> > > > >
> > >> > > > > I don't think there are any IP issues. Open source project,
> > >> > > > > open research institution, use of resources for testing and
> > >> > > > > benchmarking. We could actually make JICS a benchmarking site
> > >> > > > > for Drill (and even other Apache projects).
> > >> > > > >
> > >> > > > > We'll discuss other details in a hangout. I am also planning to
> > >> > > > > brief my team next Wednesday on the plan for the use of
> > >> > > > > resources.
> > >> > > > >
> > >> > > > > Regards,
> > >> > > > > Edmon
> > >> > > > >
> > >> > > > >
> > >> > > > > On Saturday, September 5, 2015, Ted Dunning <[email protected]> wrote:
> > >> > > > >
> > >> > > > >> Edmon,
> > >> > > > >>
> > >> > > > >> This is very interesting. I am sure that public
> > >> > > > >> acknowledgements of contributions are easily managed.
> > >> > > > >>
> > >> > > > >> What might be even more useful for you would be small-scale
> > >> > > > >> publications, especially about the problems of shoe-horning
> > >> > > > >> real-world data objects into the quasi-relational model of
> > >> > > > >> Drill.
> > >> > > > >>
> > >> > > > >> What would be problematic (and what is probably just a matter
> > >> > > > >> of nomenclature) is the naming of an institution by the
> > >> > > > >> Apache-specific term "committer" (you said commitment).
> > >> > > > >> Individuals at your institution would absolutely be up for
> > >> > > > >> being committers as they demonstrate a track record of
> > >> > > > >> contribution.
> > >> > > > >>
> > >> > > > >> I would expect no need for any paperwork between JICS and
> > >> > > > >> Apache unless you would like to execute a corporate contributor
> > >> > > > >> license to ensure that particular individuals are specifically
> > >> > > > >> empowered to contribute code. I don't know what the position of
> > >> > > > >> JICS is relative to intellectual property, though, so it might
> > >> > > > >> be worth checking out institutional policy on your side on how
> > >> > > > >> individuals can contribute to open source projects. It
> > >> > > > >> shouldn't be too hard since there are quite a number of
> > >> > > > >> NSF-funded people who do contribute.
> > >> > > > >>
> > >> > > > >>
> > >> > > > >>
> > >> > > > >>
> > >> > > > >>
> > >> > > > >> On Fri, Sep 4, 2015 at 9:39 PM, Edmon Begoli <[email protected]> wrote:
> > >> > > > >>
> > >> > > > >> > I can work with my institution and the NSF so that we commit
> > >> > > > >> > the time on the Beacon supercomputing cluster to Apache and
> > >> > > > >> > the Drill project. Maybe 20 hours a month for 4-5 nodes.
> > >> > > > >> >
> > >> > > > >> > I have discretionary hours that I can put in, and I can,
> > >> > > > >> > with our HPC admins, create deploy scripts on a few clustered
> > >> > > > >> > machines (these are all very large boxes with 16 cores,
> > >> > > > >> > 256 GB of memory, a 40 Gb IB interconnect, and a local 1 TB
> > >> > > > >> > SSD each). There is also the Medusa 10 PB filesystem
> > >> > > > >> > attached, but HDFS over local drives would probably be
> > >> > > > >> > better.
> > >> > > > >> > They are otherwise just regular machines, and run regular
> > >> > > > >> > JVMs on Linux.
> > >> > > > >> >
> > >> > > > >> > We can also get Rahul access with a secure token to set up
> > >> > > > >> > and run stress/performance/integration tests for Drill. I can
> > >> > > > >> > actually help there as well. This can be automated to run
> > >> > > > >> > tests and collect results.
> > >> > > > >> > I think that the only requirement would be that the JICS
> > >> > > > >> > team be named for the commitment, because both NSF/XSEDE and
> > >> > > > >> > UT like to see the resources being officially used and
> > >> > > > >> > acknowledged. They are there to support open and academic
> > >> > > > >> > research; open source projects fit well.
> > >> > > > >> >
> > >> > > > >> > If this sounds OK with the project PMCs, I can start the
> > >> > > > >> > process of allocation, account creation, and setup.
> > >> > > > >> >
> > >> > > > >> > I would also, as the CDO of JICS, sign whatever standard
> > >> > > > >> > papers are needed with the Apache organization.
> > >> > > > >> >
> > >> > > > >> > With all this being said, please let me know if this is
> > >> > > > >> > something we want to pursue.
> > >> > > > >> >
> > >> > > > >> > Thank you,
> > >> > > > >> > Edmon
> > >> > > > >> >
> > >> > > > >> > On Tuesday, September 1, 2015, Jacques Nadeau <[email protected]> wrote:
> > >> > > > >> >
> > >> > > > >> > > I spent a bunch of time looking at the Phi coprocessors
> > >> > > > >> > > and forgot to get back to the thread. I'd love it if
> > >> > > > >> > > someone spent some time looking at leveraging them (since
> > >> > > > >> > > Drill is frequently processor-bound). Any takers?
> > >> > > > >> > >
> > >> > > > >> > >
> > >> > > > >> > >
> > >> > > > >> > > --
> > >> > > > >> > > Jacques Nadeau
> > >> > > > >> > > CTO and Co-Founder, Dremio
> > >> > > > >> > >
> > >> > > > >> > > On Mon, Aug 31, 2015 at 10:24 PM, Parth Chandra <[email protected]> wrote:
> > >> > > > >> > >
> > >> > > > >> > > > Hi Edmon,
> > >> > > > >> > > >   Sorry no one seems to have got back to you on this.
> > >> > > > >> > > >   We are in the process of publishing a test suite for
> > >> > > > >> > > > regression testing Drill, and the cluster you have (even
> > >> > > > >> > > > a few nodes) would be a great resource for folks to run
> > >> > > > >> > > > the test suite. Rahul et al. are working on this, and I
> > >> > > > >> > > > would suggest watching out for Rahul's posts on the topic.
> > >> > > > >> > > >
> > >> > > > >> > > > Parth
> > >> > > > >> > > >
> > >> > > > >> > > > On Tue, Aug 25, 2015 at 9:55 PM, Edmon Begoli <[email protected]> wrote:
> > >> > > > >> > > >
> > >> > > > >> > > > > Hey folks,
> > >> > > > >> > > > >
> > >> > > > >> > > > > As we discussed today on a hangout, this is a machine
> > >> > > > >> > > > > that we have at JICS/NICS where I have Drill installed
> > >> > > > >> > > > > and where I could set up a test cluster over a few
> > >> > > > >> > > > > nodes.
> > >> > > > >> > > > >
> > >> > > > >> > > > >
> > >> > > > >> > > > > https://www.nics.tennessee.edu/computing-resources/beacon/configuration
> > >> > > > >> > > > >
> > >> > > > >> > > > > Note that each node has:
> > >> > > > >> > > > > - 2x 8-core Intel® Xeon® E5-2670 processors
> > >> > > > >> > > > > - 256 GB of memory
> > >> > > > >> > > > > - 4 Intel® Xeon Phi™ 5110P coprocessors with 8 GB of
> > >> > > > >> > > > >   memory each
> > >> > > > >> > > > > - 960 GB of SSD storage
> > >> > > > >> > > > >
> > >> > > > >> > > > > Would someone advise on what would be an interesting
> > >> > > > >> > > > > test setup?
> > >> > > > >> > > > >
> > >> > > > >> > > > > Thank you,
> > >> > > > >> > > > > Edmon
> > >> > > > >> > > > >
> > >> > > > >> > > >
> > >> > > > >> > >
> > >> > > > >> >
> > >> > > > >>
> > >> > > > >
> > >> > > >
> > >> > >
> > >> >
> > >>
> > >
> > >
> >
>
