On Thu, May 12, 2016 at 10:51 AM, Sean Myers <[email protected]> wrote:
> Early planning for Pulp 3.0 is building up some steam, and it's
> a good time to go over the proposed technology stack that we're
> looking at building on right now. For all of these choices, once
> Pulp's basic needs are met, the major deciding factor for which
> library to use comes down to "meta" factors, like community
> support, release processes, etc. Special thanks to Jeff Ortel for
> making sure my assumptions about these tools got challenged so the
> right choices get made.
>
> We're using postgres as the DB for 3.0. Since we're going
> relational, the next thing we'd want is a good ORM. Several team
> members have experience with the Django ORM, and Pulp is actually
> already using it in its views. It has a fantastic community, is
> well documented, and comes with a vast multitude of third-party
> plugins to help us fill in any gaps in functionality that may be
> found. Our current tasking system is built on Celery[0], which is
> among those third-party plugins with excellent Django support,
> which potentially means that using Django with a relational DB
> can help us get rid of code where we overlap functionality that
> may be provided by django-celery.
>
> Other ORM options were considered, but only SQLAlchemy (another
> very good ORM) stood out as something we could use if there was
> a compelling reason to switch from Django, but at this time there
> is no such reason. Django does the job well. Most other ORMs are
> either not robust enough in their feature-set or apparently not
> being actively maintained, and were rejected as alternatives.
> Also rejected outright was not using an ORM (or other form of
> data mapper) at all, since my sense is that we all agree that
> we don't want to be writing SQL manually. :)
>
> This leads to the next big building block, which is the tool we
> should use to build our REST APIs. I've used django-tastypie in
> the past, as have a few other team members, and it was my
> front-runner for this job. After looking around though, it looks
> like django-rest-framework (DRF) is currently dominating this space
> in the Django community[1]. Going through some of their tutorials
> and examples, it's looking like tastypie is out of the running,
> and DRF is the winner. Both would be adequate for Pulp's needs
> when it comes to putting a REST API on top of our data model, so
> it makes sense to go with the more "popular" option. In addition,
> I think its documentation and API are easier to work with than
> tastypie's, so it's simultaneously easier to use and easier to
> *learn how* to use.
>
> Finally, we're looking at bringing in a search engine for the
> search views in the API. We're currently doing search using
> mongodb, using mongo-specific search criteria, but will be
> decoupling the search API from the search engine. As with Django,
> a few team members have experience using Elasticsearch (myself
> included). Elasticsearch is Java-based, running on top of the
> Lucene indexer, with a simple REST API on top of it, and so at
> the moment it's my preferred search engine.
>
> I looked at a few other search engines in recent testing, including
> the pure-python engine "Whoosh", Solr (also uses Lucene), Xapian,
> and Sphinx (the search engine, not the document builder). Of these,
> only Whoosh and Elasticsearch have first-party support by the
> django-haystack project[2], which is both my preferred and the most
> commonly used django search plugin[3]. Given my previous positive
> experience with Elasticsearch, I think it's probably the best choice
> for a search indexer at this time.
>

Can you expand on why a separate search service is needed and why
Postgres won't meet your needs?

Thanks,
Eric

> The Whoosh plugin for Haystack currently doesn't support a very
> useful feature that Whoosh itself does support, which is faceting.
> This feature gap is something that would need to be closed (likely
> by us) to get feature parity between the Elasticsearch and Whoosh
> backends.
>
> While there are other libraries that appear to live in the same space
> as haystack (integrate a search indexer with Django models, providing
> Django QuerySet/Model results), none of them have the robust features
> and community support seen in haystack. Again, though, decoupling the
> search interface from the search implementation means that this piece
> is likely to be easy to change out if we find better options in the
> future (especially if we write it with this in mind).
>
> Summary:
> - Django ORM on postgres
> - django-rest-framework to build API views
> - django-haystack to provide search capabilities, using Elasticsearch
>   to start, possibly switching to Whoosh after some development -- this
>   switch should occur before any release of 3.0
>
> [0]: http://docs.celeryproject.org/en/latest/django/
> [1]: https://www.djangopackages.com/grids/g/rest/
> [2]: http://django-haystack.readthedocs.io/en/stable/backend_support.html
> [3]: https://www.djangopackages.com/grids/g/search/
>
>
> _______________________________________________
> Pulp-list mailing list
> [email protected]
> https://www.redhat.com/mailman/listinfo/pulp-list
>
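
To make the point about decoupling the search interface from the search
implementation concrete, here is a minimal sketch (not Pulp code) of an
API layer that only talks to an abstract backend. The names
SearchBackend, InMemoryBackend, and search_units are hypothetical, and
the in-memory backend is only a stand-in for whichever engine ends up
behind the interface, whether that is Elasticsearch, Whoosh, or Postgres
itself:

```python
from abc import ABC, abstractmethod


class SearchBackend(ABC):
    """Hypothetical interface the search API would code against."""

    @abstractmethod
    def index(self, doc_id, fields):
        """Store a document's searchable fields under doc_id."""

    @abstractmethod
    def search(self, term):
        """Return the ids of documents matching term."""


class InMemoryBackend(SearchBackend):
    """Toy stand-in for a real engine; does a case-insensitive
    substring match over every indexed field."""

    def __init__(self):
        self._docs = {}

    def index(self, doc_id, fields):
        # Normalize values once at index time so search stays simple.
        self._docs[doc_id] = {k: str(v).lower() for k, v in fields.items()}

    def search(self, term):
        needle = term.lower()
        return sorted(
            doc_id
            for doc_id, fields in self._docs.items()
            if any(needle in value for value in fields.values())
        )


def search_units(backend, term):
    """What a search view would call; it never sees the engine."""
    return backend.search(term)
```

With this shape, swapping engines means writing another SearchBackend
subclass; callers of search_units would not change.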
