Wes, Sounds good, will pull you in if needed.
Kristian - Sure. We were thinking of doing the individual OS jars as
snapshot deployments too, but this might be better. Will work on getting
a PR ready for the same.

On Mon, Oct 15, 2018 at 5:44 PM Wes McKinney <wesmck...@gmail.com> wrote:

> I am not familiar with snapshot deployment processes for Java packages.
> Since there are so many Apache projects that are Java-based, we should
> look at what others are doing. If you need help with anything that
> requires PMC karma I can try to help.
>
> On Mon, Oct 15, 2018, 7:36 AM Krisztián Szűcs <szucs.kriszt...@gmail.com>
> wrote:
>
> > Actually We should deploy to github releases as well, see
> > https://github.com/apache/arrow/blob/master/dev/tasks/conda-recipes/travis.linux.yml#L59
> >
> > Then We could download the jars from github directly, similarly like
> > https://github.com/kszucs/crossbow/releases/tag/nightly-73-ubuntu-xenial
> >
> > Once We have a working build cycle deploying to github We can set up
> > another deployment to maven (however I can't manage maven permissions) -
> > so the maven credential issue shouldn't block the actual artifact
> > building process.
> >
> > On Mon, Oct 15, 2018 at 1:12 PM Praveen Kumar <prav...@dremio.com>
> > wrote:
> >
> > > Hey Kristian,
> > >
> > > Yes you are right, I am planning to use an encrypted variable in
> > > travis.
> > >
> > > But which token do i use for the deployment? My ossrh account will not
> > > have deploy permissions in apache/arrow maven repository, so was
> > > wondering which token to use? Once this is clarified, i will raise the
> > > PR that creates the jar and deploys the same.
> > >
> > > Would you prefer to discuss the token too on the PR, if yes i will
> > > raise the PR by tomorrow. Currently i tested on my private repo
> > > (https://github.com/praveenbingo/crossbow) but the actual deploy would
> > > be configured on (https://github.com/dremio/crossbow).
> > >
> > > Thx.
> > >
> > > On Mon, Oct 15, 2018 at 4:09 PM Krisztián Szűcs
> > > <szucs.kriszt...@gmail.com> wrote:
> > >
> > > > Hi Praveen,
> > > >
> > > > I assume We're planning to run it on travis, so We need to pass an
> > > > encrypted env variable:
> > > > https://docs.travis-ci.com/user/environment-variables/#defining-variables-in-repository-settings
> > > >
> > > > Have You created the crossbow task for creating the jar? If You
> > > > submit a PR We could further discuss the deployment steps there.
> > > >
> > > > On Mon, Oct 15, 2018 at 5:53 AM Praveen Kumar <prav...@dremio.com>
> > > > wrote:
> > > >
> > > > > Hi Kristian/Wes,
> > > > >
> > > > > Can you please advise on the deploy tokens. Also do you want to
> > > > > include the arrow jars in the snapshot deploy?
> > > > >
> > > > > Thx.
> > > > >
> > > > > On Fri, Oct 12, 2018 at 11:50 AM Praveen Kumar <prav...@dremio.com>
> > > > > wrote:
> > > > >
> > > > > > Hi Kristian,
> > > > > >
> > > > > > Thanks for reviewing.
> > > > > >
> > > > > > Yup that is our plan too, we are targeting the ubuntu release
> > > > > > first. We will pick the mac and the combiner as required later.
> > > > > >
> > > > > > For the frequency of deployments, we would be doing at least once
> > > > > > a day with the flexibility to manually trigger too.
> > > > > >
> > > > > > Thx.
> > > > > >
> > > > > > On Thu, Oct 11, 2018 at 9:41 PM Krisztián Szűcs
> > > > > > <szucs.kriszt...@gmail.com> wrote:
> > > > > >
> > > > > >> On Thu, Oct 11, 2018 at 12:58 PM Praveen Kumar <prav...@dremio.com>
> > > > > >> wrote:
> > > > > >>
> > > > > >> > Hi All,
> > > > > >> >
> > > > > >> > I spent some time today understanding crossbow and it looks
> > > > > >> > great!
> > > > > >> >
> > > > > >> > To unblock ourselves immediately, we are going to do the ubuntu
> > > > > >> > deploy first, followed by the mac deploy and the fat jar
> > > > > >> > deployment.
> > > > > >> >
> > > > > >> > To confirm our understanding we would be doing the following
> > > > > >> >
> > > > > >> > 1. Create a queue repo similar to one here
> > > > > >> > (https://github.com/praveenbingo/crossbow) but under dremio org.
> > > > > >>
> > > > > >> Correct, although We might want a centralized crossbow repo to
> > > > > >> deploy scheduled (e.g. nightly) packages.
> > > > > >>
> > > > > >> > 2. Have the repo kick off crossbow builds for each OS that we
> > > > > >> > would want.
> > > > > >>
> > > > > >> Correct. To run the tasks:
> > > > > >> `python crossbow.py submit gandiva-osx gandiva-ubuntu`
> > > > > >> It returns the build identifier, e.g. `build-123`
> > > > > >>
> > > > > >> > 3. In addition to OS builds, there would be another build which
> > > > > >> > would just be waiting for the OS builds to finish (with some
> > > > > >> > timeout) and once done will package the fat jar and deploy to
> > > > > >> > maven.
> > > > > >>
> > > > > >> Basically yes, but depending on the build times it might be worth
> > > > > >> building the fat jar locally instead (of course You can trigger
> > > > > >> another task which does the same thing just remotely). Currently
> > > > > >> the artifact downloading is built into the `sign` command, but we
> > > > > >> can quickly factor that out: `python crossbow.py sign build-123`
> > > > > >>
> > > > > >> I'd like to generalize task dependencies, but this is definitely
> > > > > >> the quickest to start with.
> > > > > >>
> > > > > >> > The only thing that i am unclear of is the maven deploy tokens.
> > > > > >> > Since i am not a committer with permissions to push to maven
> > > > > >> > repo, I would need keys to be configured in the dremio/crossbow
> > > > > >> > environment variables.
> > > > > >> >
> > > > > >> How often do We want to ship fat jars?
> > > > > >>
> > > > > >> >
> > > > > >> > Wes - do Siddharth/Jacques have permissions to push to maven
> > > > > >> > repo and can i use the same?
> > > > > >> >
> > > > > >> > Also looks like the release scripts here
> > > > > >> > <https://github.com/apache/arrow/blob/master/dev/release/01-perform.sh>
> > > > > >> > would need to be changed as well if we want to deploy the fat jar
> > > > > >> > as part of releases.
> > > > > >>
> > > > > >> Correct.
> > > > > >>
> > > > > >> > Kristian - can you please review the proposed steps and let me
> > > > > >> > know if they look correct to you?
> > > > > >>
> > > > > >> Absolutely!
> > > > > >>
> > > > > >> BTW if You want to unblock yourself first, then it's enough to have
> > > > > >> a single task which builds the ubuntu libs and the fat jar (in a
> > > > > >> single CI build), and We can handle the dependent task (fat jar
> > > > > >> building) after We introduce another child (mac or win). So We
> > > > > >> could spare the third step in the first iteration.
> > > > > >>
> > > > > >> > Thx.
> > > > > >> >
> > > > > >> > On Wed, Oct 10, 2018 at 11:33 PM Praveen Kumar <prav...@dremio.com>
> > > > > >> > wrote:
> > > > > >> >
> > > > > >> > > Hi Wes,
> > > > > >> > >
> > > > > >> > > I'll take this to completion. Will send out a proposal tomorrow.
> > > > > >> > >
> > > > > >> > > Thx.
> > > > > >> > >
> > > > > >> > > On Wed, Oct 10, 2018, 23:32 Wes McKinney <wesmck...@gmail.com>
> > > > > >> > > wrote:
> > > > > >> > >
> > > > > >> > >> hi folks,
> > > > > >> > >>
> > > > > >> > >> How would you like to proceed on this?
> > > > > >> > >> I'm tracking many projects right now so I want to make sure
> > > > > >> > >> someone else is "in charge" on this effort
> > > > > >> > >>
> > > > > >> > >> Thanks,
> > > > > >> > >> Wes
> > > > > >> > >> On Sat, Oct 6, 2018 at 10:37 AM Wes McKinney <wesmck...@gmail.com>
> > > > > >> > >> wrote:
> > > > > >> > >> >
> > > > > >> > >> > > We could create a worker pool like abstraction where the
> > > > > >> > >> > > workers are the CI services, but that would require a
> > > > > >> > >> > > scheduler to poll the finished jobs then submit the
> > > > > >> > >> > > dependent ones. This sounds a bit inconvenient, where would
> > > > > >> > >> > > that scheduler run: locally, on a CI or self hosted?
> > > > > >> > >> >
> > > > > >> > >> > Inevitably we're going to need to build some kind of job
> > > > > >> > >> > scheduler, whether it uses Airflow or Luigi or some other
> > > > > >> > >> > tool of our own devising.
> > > > > >> > >> >
> > > > > >> > >> > Apache Arrow is eventually going to need a host where we can
> > > > > >> > >> > manage such workflows. I'm looking into the possibility of a
> > > > > >> > >> > physical CUDA-equipped host that could be made available to
> > > > > >> > >> > Arrow developers to use for testing and benchmarking. I may
> > > > > >> > >> > need to run the machine out of my home (we did something
> > > > > >> > >> > similar for pandas -- physical machine that we can SSH into).
> > > > > >> > >> >
> > > > > >> > >> > All this idealism aside -- we take the shortest path possible
> > > > > >> > >> > for this particular packaging job, and make improvements as
> > > > > >> > >> > we can going forward.
> > > > > >> > >> > On Sat, Oct 6, 2018 at 9:31 AM Krisztián Szűcs
> > > > > >> > >> > <szucs.kriszt...@gmail.com> wrote:
> > > > > >> > >> > >
> > > > > >> > >> > > I see now, so the jar would contain all of the three shared
> > > > > >> > >> > > libraries.
> > > > > >> > >> > >
> > > > > >> > >> > > We could create a worker pool like abstraction where the
> > > > > >> > >> > > workers are the CI services, but that would require a
> > > > > >> > >> > > scheduler to poll the finished jobs then submit the
> > > > > >> > >> > > dependent ones. This sounds a bit inconvenient, where would
> > > > > >> > >> > > that scheduler run: locally, on a CI or self hosted?
> > > > > >> > >> > >
> > > > > >> > >> > > Another approach would be to use the worker to schedule the
> > > > > >> > >> > > next task, in a similar fashion to how dask's worker_client
> > > > > >> > >> > > [1] launches tasks from tasks.
> > > > > >> > >> > > There could be synchronization problems though. This
> > > > > >> > >> > > approach requires bootstrapping crossbow on each CI job but
> > > > > >> > >> > > that would:
> > > > > >> > >> > > - make crossbow less CI dependent (to use azure pipelines
> > > > > >> > >> > >   as well)
> > > > > >> > >> > > - unify the artifact uploading and downloading logic which
> > > > > >> > >> > >   is required in order to support dependent tasks
> > > > > >> > >> > > - way less redundancy in task definitions
> > > > > >> > >> > >
> > > > > >> > >> > > What do You think? I'd prefer the second one.
> > > > > >> > >> > >
> > > > > >> > >> > > [1]
> > > > > >> > >> > > https://github.com/dask/distributed/blob/master/docs/source/task-launch.rst
> > > > > >> > >> > >
> > > > > >> > >> > > On Sat, Oct 6, 2018 at 10:57 AM Wes McKinney <wesmck...@gmail.com>
> > > > > >> > >> > > wrote:
> > > > > >> > >> > >
> > > > > >> > >> > > > It seems the complicated part of this will be having a
> > > > > >> > >> > > > dependent task that packages up the 3 shared libraries,
> > > > > >> > >> > > > one for each platform, after the individual packaging
> > > > > >> > >> > > > tasks are run. How would you propose handling that?
> > > > > >> > >> > > > On Fri, Oct 5, 2018 at 8:03 AM Krisztián Szűcs
> > > > > >> > >> > > > <szucs.kriszt...@gmail.com> wrote:
> > > > > >> > >> > > > >
> > > > > >> > >> > > > > Ohh, just read the thread, sorry!
> > > > > >> > >> > > > >
> > > > > >> > >> > > > > So crossbow is located here
> > > > > >> > >> > > > > https://github.com/apache/arrow/tree/master/dev/tasks
> > > > > >> > >> > > > > I suggest to "fork" the python-wheels directory which
> > > > > >> > >> > > > > contains three templated ymls for osx, win and linux
> > > > > >> > >> > > > > builds.
> > > > > >> > >> > > > > For building on linux something like the following
> > > > > >> > >> > > > > should be sufficient
> > > > > >> > >> > > > > https://gist.github.com/kszucs/39154876d60c4109ff59b678afd65b19
> > > > > >> > >> > > > > Then You need another entry in the tasks.yml, for
> > > > > >> > >> > > > > example:
> > > > > >> > >> > > > > jar-gandiva-linux:
> > > > > >> > >> > > > >   platform: linux
> > > > > >> > >> > > > >   template: gandiva-jars/travis.linux.yml
> > > > > >> > >> > > > >   params:
> > > > > >> > >> > > > >     # arbitrary params which are available from the templated yml
> > > > > >> > >> > > > >     ...
> > > > > >> > >> > > > >   artifacts:
> > > > > >> > >> > > > >     # these are the expected artifacts from the build
> > > > > >> > >> > > > >     - gandiva-SNAPSHOT-{version}.jar
> > > > > >> > >> > > > >     ...
> > > > > >> > >> > > > >
> > > > > >> > >> > > > > Of course crossbow is wired towards the current
> > > > > >> > >> > > > > packaging requirements, so likely We need to adjust it
> > > > > >> > >> > > > > to the newly appearing requirements.
> > > > > >> > >> > > > >
> > > > > >> > >> > > > > Feel free to reach me on gitter @kszucs.
> > > > > >> > >> > > > > On Oct 4 2018, at 2:02 pm, Wes McKinney
> > > > > >> > >> > > > > <wesmck...@gmail.com> wrote:
> > > > > >> > >> > > > > >
> > > > > >> > >> > > > > > hi Praveen,
> > > > > >> > >> > > > > > Probably the best way to accomplish this is to use our
> > > > > >> > >> > > > > > new Crossbow infrastructure for task automation on
> > > > > >> > >> > > > > > Travis CI and Appveyor rather than trying to do all of
> > > > > >> > >> > > > > > this within the CI entries.
> > > > > >> > >> > > > > > This is how we are producing all of our binary
> > > > > >> > >> > > > > > artifacts for releases now -- presumably in future ASF
> > > > > >> > >> > > > > > releases, we will want to include a
> > > > > >> > >> > > > > > platform-independent Gandiva JAR in our release votes,
> > > > > >> > >> > > > > > so this all needs to end up in Crossbow anyway. The
> > > > > >> > >> > > > > > intent is for the Crossbow system to take on
> > > > > >> > >> > > > > > responsibility for all packaging automation rather
> > > > > >> > >> > > > > > than using the normal CI for that.
> > > > > >> > >> > > > > >
> > > > > >> > >> > > > > > Krisztian, do you have time to help Praveen and the
> > > > > >> > >> > > > > > Gandiva crew with this project? This will be an
> > > > > >> > >> > > > > > important test to document and improve Crossbow for
> > > > > >> > >> > > > > > such use cases
> > > > > >> > >> > > > > >
> > > > > >> > >> > > > > > Thanks
> > > > > >> > >> > > > > > Wes
> > > > > >> > >> > > > > > On Thu, Oct 4, 2018 at 7:14 AM Praveen Kumar
> > > > > >> > >> > > > > > <prav...@dremio.com> wrote:
> > > > > >> > >> > > > > > >
> > > > > >> > >> > > > > > > Hi Folks,
> > > > > >> > >> > > > > > > As part of
> > > > > >> > >> > > > > > > https://issues.apache.org/jira/browse/ARROW-3385, we
> > > > > >> > >> > > > > > > are planning to perform a snapshot release of the
> > > > > >> > >> > > > > > > Gandiva Jar on each commit to master.
> > > > > >> > >> > > > > > > This would be a platform independent jar that
> > > > > >> > >> > > > > > > contains the core gandiva library and its jni bridge
> > > > > >> > >> > > > > > > packaged for Mac, Windows and *nix platforms.
> > > > > >> > >> > > > > > >
> > > > > >> > >> > > > > > > The current plan is to deploy separate snapshot jars
> > > > > >> > >> > > > > > > for each OS through entries in the Gandiva CI matrix
> > > > > >> > >> > > > > > > and then have a combine step that pulls in each OS
> > > > > >> > >> > > > > > > specific jar and builds a jar that has all the
> > > > > >> > >> > > > > > > native libraries. This build/deploy would happen
> > > > > >> > >> > > > > > > only for commits on master branch and not for PR
> > > > > >> > >> > > > > > > requests
> > > > > >> > >> > > > > > >
> > > > > >> > >> > > > > > > Does the plan sound ok (or) please let us know if
> > > > > >> > >> > > > > > > there is a better way to achieve the same.
> > > > > >> > >> > > > > > >
> > > > > >> > >> > > > > > > If it sounds ok, can someone please help with the
> > > > > >> > >> > > > > > > following
> > > > > >> > >> > > > > > > 1. It looks like we only do travis builds and not
> > > > > >> > >> > > > > > > appveyor for master in arrow. Any reason for this?
> > > > > >> > >> > > > > > > 2. Even if we did appveyor is there a way to
> > > > > >> > >> > > > > > > sequence the builds. Like wait for appveyor to
> > > > > >> > >> > > > > > > complete before kicking off travis?
> > > > > >> > >> > > > > > > Since we would need the dll to be pre-built.
> > > > > >> > >> > > > > > > 3. Someone would need to configure the credentials
> > > > > >> > >> > > > > > > to use for the ossrh deployment. The credentials
> > > > > >> > >> > > > > > > would need access to deploy to org.apache.arrow.
> > > > > >> > >> > > > > > >
> > > > > >> > >> > > > > > > Thanks ahead!
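
For reference, the combine step from the original proposal (pull in each OS specific jar and build one jar that has all the native libraries) could look roughly like the sketch below. It is only an illustration under assumptions, not what crossbow does today: `combine_jars` is a hypothetical helper, and it assumes the per-OS jars carry identical Java classes and differ only in their native libraries.

```python
import zipfile


def combine_jars(os_jars, fat_jar):
    """Merge several per-OS jars into one jar (hypothetical helper).

    A jar is a zip archive, so entries are copied one by one. The first
    occurrence of a path wins: shared class files are written once, while
    each platform's native library is carried over.
    """
    seen = set()
    with zipfile.ZipFile(fat_jar, "w", zipfile.ZIP_DEFLATED) as out:
        for jar in os_jars:
            with zipfile.ZipFile(jar) as src:
                for info in src.infolist():
                    if info.filename in seen:
                        continue  # already copied from an earlier jar
                    seen.add(info.filename)
                    out.writestr(info, src.read(info.filename))
    # Return the entry names that ended up in the combined jar.
    return sorted(seen)
```

The input jars would be the artifacts downloaded from the individual OS builds. Whether this runs locally or as a dependent task is the open question discussed in the thread.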