I propose we continue working with the existing docker-solr repo for now, 
until we fully understand how we want to proceed with moving to ASF-owned 
git infra and Docker Hub accounts.

I feel that some work should have higher priority for now:
- Document running Solr on Docker in Ref Guide
- Start thinking about how to include Docker image publishing in the release 
process
- Add a simple Dockerfile to our main git repo and a gradle task for 
building it
- Update the README in docker-solr repo to reflect the new ownership

Some of these could be sub-tasks of SOLR-14168.

Other thoughts?

Jan

> On 12 Jan 2020, at 04:46, David Smiley <[email protected]> wrote:
> 
> > Yes, it should be easy to build a docker image «from source», or at least 
> > as a gradle build task. That could piggy-back on the distro tgz file which 
> > should make it not too different - we just pull the release from local disk 
> > instead of from the mirrors. 
> 
> We do this at Salesforce in our local Lucene_Solr fork to also produce a 
> docker image.  It's not a big deal but I could share it if we want to 
> consider going this direction.  It's kinda necessary if we want to release 
> this all at once instead of requiring a 'tgz' be released first, which in 
> turn somewhat requires some signatures of that binary that then become 
> irrelevant to check when producing the Docker image.  It's also super nice 
> for those who fork Solr to also produce a Docker image easily (like us).
> 
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley 
> 
> On Sat, Jan 11, 2020 at 5:45 PM Jan Høydahl <[email protected]> wrote:
>>> 1. Are we allowed to maintain ASF code in a non-ASF repo? If not, how do we 
>>> transition to
>>> an ASF git repo?
>>>     * Can it be a sub folder in our main repo or does it need to be a 
>>> separate repo?
>> 
>> The way it works (from the official library’s point of view) is that we 
>> maintain 
>> https://github.com/docker-library/official-images/blob/master/library/solr 
>> which contains a link to a repo (in our case 
>> https://github.com/docker-solr/docker-solr.git), a particular git commit, 
>> and a particular directory for the different versions. That file is 
>> consumed by their build infrastructure. The library team reviews changes we 
>> make to that file, and the corresponding changes we made to the Dockerfiles 
>> and bash scripts in the docker-solr repo, so the repo needs to be readily 
>> available and it needs to be easy to see what has changed.
>> 
>> I think one could theoretically move this into the main Solr repo and point 
>> to its GitHub address, but that would make things slower and much harder to 
>> review. So I think it’s much better to keep the separate repo. I briefly 
>> looked for some official guidance on this, but couldn’t find it spelled out 
>> explicitly. I did see 
>> https://github.com/docker-library/official-images#maintainership which 
>> talks about maintaining git history.
>> Note also that I already use a “docker-solr” GitHub org for the repo, rather 
>> than my own account, to make it easier to vary ownership.
>> 
>> If you are dead set on putting it into the main repo, I’d run that 
>> discussion past the library team first before sinking engineering time 
>> into it.
> 
> I just discovered https://hub.docker.com/u/apache which is Apache’s own 
> Docker org. I see some images there are hosted in separate Apache git repos, 
> for example CouchDB: https://github.com/apache/couchdb-docker is pushed to 
> https://hub.docker.com/r/apache/couchdb and https://hub.docker.com/_/couchdb 
> (official). The source of both hub locations seems to be the same 
> apache/couchdb-docker git repo. I see that the person who files PRs against 
> the official image repo is Joan Touzet (http://people.apache.org/~wohali/) 
> who is a CouchDB committer. Perhaps this is a model for us to follow.
> 
> We may also want to consult LEGAL-503 
> <https://issues.apache.org/jira/browse/LEGAL-503?focusedCommentId=17003438&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17003438>
> where the Beam project asked a similar question a few weeks ago. The reply 
> was:
> 
> if you would like to continue linking to the Docker release artifact from the 
> https://beam.apache.org you will have:
> 1. Transition to the official ASF dockerhub org: 
> https://hub.docker.com/u/apache
> 2. Start including that binary convenience artifact into your VOTE threads on 
> Beam releases
> 3. Make sure that all Cat-X licenses are ONLY brought into your container via 
> FROM statements
> 
> So bullet point #1 there answers this question. Regarding points #2 and #3, 
> see below.
> 
>>> 2. How will the current build/test/publish process need to change?
>>>     * Can we continue using travis for CI?
>> 
>> In the short term, sure.
>> 
>> Travis has been great for us — it is free, it builds fast enough, the UI is 
>> nice, the config is simple, the integration is good, and support was helpful.
>> Last year Travis CI got acquired, followed by layoffs of senior engineering 
>> staff, so there are concerns about its future, but nothing has really 
>> changed to affect us.
>> 
>> I imagine it would be nicer to have it in the normal Apache Jenkins world, 
>> but I’m not volunteering for that migration. :-)
>> 
>> If we want to stay on Travis, there may be some configuration changes 
>> required (roles/permissions/credentials and such that are tied to my 
>> account).
>> 
>> Oh and just to make it clear: the CI does 2 things:
>> - it sets build status on GitHub commits (although there is currently no 
>> enforcement to allow only passing PRs to be merged or things like that, or 
>> have review/automerge workflows which would be nice to have)
>> - and it pushes builds to the 
>> https://hub.docker.com/repository/docker/dockersolr/docker-solr repo, but 
>> those are only used for testing; they are not the docker images that provide 
>> the official images. I've found that occasionally useful, but we could 
>> decide to not do that, or do it differently within the Apache infrastructure.
> 
> I see other ASF projects using Travis as well, so perhaps the ASF has an 
> account/license? Whether we continue to use it or migrate to Jenkins, we 
> need to run the build and tests and then push builds to the Apache Docker 
> Hub repository space (making the image pullable with docker pull 
> apache/solr:tag).
> Actually producing the official image will be yet another PR to the 
> Docker-owned official-images repo.
> 
>>>     * Should publishing of new Docker images be an RM responsibility, or 
>>> something that happens right
>>> after each release like the ref-guide?
>> 
>> I don’t have a strong opinion. I typically tried to do it as soon as I 
>> became aware of a new version via the solr-user mailing list or twitter.
>> Sometimes same day, sometimes it would take a week because of changes I need 
>> to make or extra things I wanted to do.
>> But if I’m more than a few days late someone would be asking about it :-)
>> The official library team review is usually very fast, same day or 24h.
> 
> See point #2 from LEGAL-503 above. If we want to officially document / 
> endorse / link to the image on hub we may want to include the docker image in 
> the VOTE. I see that the Beam project includes this in their release guide at 
> https://beam.apache.org/contribute/release-guide/ (publishing SDK images). 
> What they do is push an RC-tagged version to their Docker Hub as part of the 
> release and include it in the VOTE.
> 
>>> 3. Legal stuff - when we as a project file a PR to update the official solr 
>>> docker images,
>>> are we then legally releasing a binary version of Solr?
>>>     Technically it is Docker CI that build and publish the images, we just 
>>> initiate it…
>> 
>> I don’t know about that (or how that matters?)
> 
> Oh, legal stuff matters a lot for Apache :) Again, I think LEGAL-503 answers 
> this. Bullet #3 there requires the project to make sure that our Dockerfile 
> does not bring Cat-X licensed software into the Docker layers built by us. 
> Since we base our image on the ‘openjdk’ base image, which contains GNU/Linux 
> binaries and the JDK, the only thing we'd need to verify is what we bring 
> into our Docker layers through apt-get, wget etc. Below is a list of what I 
> found:
> 
> acl - GPL - provides tool setfacl, used only in tests, can be removed?
> dirmngr, gpg - GPL - used only during docker build phase, can be installed 
> with apt and uninstalled in the same RUN command
> lsof - BSD license
> procps - GPL - provides the ‘ps’ command needed by bin/solr. This is part of 
> openjdk:11 but not openjdk:11-slim...
> wget - GPL - used during build only, can be uninstalled after use
> netcat - PublicDomain
> gosu - GPL - can be removed or replaced with su-exec (MIT)
> tini - MIT
> 
>>>     Do we know any other ASF projects that maintain their own official 
>>> docker image?
>> 
>> I've looked at 
>> https://github.com/docker-library/official-images/tree/master/library and 
>> spotted https://github.com/carlossg/docker-maven which is maintained by an 
>> Apache committer.
> 
> So CouchDB is another example. And there are so many projects in Apache’s 
> Docker Hub org that I suppose there may be others.
> 
>> Marcus wrote:
>> 
>>> I think that regardless of what the community decides to do with the
>>> docker-solr repo, a good first step would be to add a Docker folder to the
>>> Apache repository that contains a base Dockerfile and a README. In that
>>> README, users can be directed to the location of the docker-solr repo,
>>> wherever that may be, or leverage the Dockerfile in the Apache repo as a
>>> starting point for building their own image.
>> 
>> 
>> I think that could be useful; but it then does start to become messy almost 
>> immediately: Users will expect these self-built images and the official 
>> images to work the same, and given that docker-solr has various extra 
>> scripts (eg to create collections at startup), you’d then have to copy them 
>> into the repo (and now have duplicate maintenance, need to test them). Or 
>> you could explicitly decide not to do that, but then your users will be 
>> asking how to achieve the same functionality with their images.
>> 
>> I would address this as a separate issue. Let’s get the existing image flow 
>> taken care of first.
> 
> Yes, it should be easy to build a docker image «from source», or at least as 
> a gradle build task. That could piggy-back on the distro tgz file which 
> should make it not too different - we just pull the release from local disk 
> instead of from the mirrors. 
> 
> I also saw some projects that have Jenkins routinely publish SNAPSHOT 
> releases to Docker Hub, see e.g. https://hub.docker.com/r/apache/syncope/tags 
> which is also nice if we want to have people test out things with unreleased 
> versions or the master branch; then it is always only a docker run command 
> away :) 
> 
> Well, I hope other committers also join this discussion and perhaps bring 
> other points of view before we start fleshing out actual JIRA tasks to add 
> under SOLR-14168 (https://issues.apache.org/jira/browse/SOLR-14168).
> 
> If we end up releasing official Solr Docker images together with the normal 
> release, it would be cool to add documentation to the RefGuide, and perhaps 
> the tutorial, on how to run Solr with Docker.
> 
> Jan
> 
> 
