Early planning for Pulp 3.0 is building up some steam, so it's a good time to go over the technology stack that we're currently proposing to build on. For all of these choices, once Pulp's basic needs are met, the major deciding factors for which library to use are "meta" factors, like community support, release processes, etc. Special thanks to Jeff Ortel for making sure my assumptions about these tools got challenged so the right choices get made.
We're using PostgreSQL as the DB for 3.0. Since we're going relational, the next thing we'd want is a good ORM. Several team members have experience with the Django ORM, and Pulp is actually already using Django for its views. It has a fantastic community, is well documented, and comes with a multitude of third-party plugins to help us fill in any gaps in functionality that may be found. Our current tasking system is built on Celery[0], which is among the third-party projects with excellent Django support. That potentially means that using Django with a relational DB can help us get rid of code that duplicates functionality that may be provided by django-celery.

Other ORM options were considered, but only SQLAlchemy (another very good ORM) stood out as something we could use if there was a compelling reason to switch from Django, and at this time there is no such reason. Django does the job well. Most other ORMs are either not robust enough in their feature set or apparently not being actively maintained, and were rejected as alternatives. Also rejected outright was not using an ORM (or other form of data mapper) at all, since my sense is that we all agree that we don't want to be writing SQL by hand. :)

This leads to the next big building block, which is the tool we should use to build our REST APIs. I've used django-tastypie in the past, as have a few other team members, and it was my front-runner for this job. After looking around though, it looks like django-rest-framework (DRF) is currently dominating this space in the Django community[1]. Going through some of the DRF tutorials and examples, it's looking like tastypie is out of the running, and DRF is the winner. Both would be adequate for Pulp's needs when it comes to putting a REST API on top of our data model, so it makes sense to go with the more "popular" option.
In addition, I think DRF's documentation and API are easier to work with than tastypie's, so it's simultaneously easier to use and easier to *learn how* to use.

Finally, we're looking at bringing in a search engine for the search views in the API. We're currently doing search using MongoDB, with mongo-specific search criteria, but will be decoupling the search API from the search engine. As with Django, a few team members have experience using Elasticsearch (myself included). Elasticsearch is Java-based, running on top of the Lucene indexer, with a simple REST API on top of it, and so at the moment it's my preferred search engine.

I looked at a few other search engines in recent testing, including the pure-Python engine "Whoosh", Solr (also uses Lucene), Xapian, and Sphinx (the search engine, not the document builder). Of these, only Whoosh and Elasticsearch have first-party support by the django-haystack project[2], which is both my preferred and the most commonly used Django search plugin[3]. Given my previous positive experience with Elasticsearch, I think it's probably the best choice for a search indexer at this time.

The Whoosh plugin for Haystack currently doesn't support faceting, a very useful feature that Whoosh itself does support. This feature gap is something that would need to be closed (likely by us) to get feature parity between the Elasticsearch and Whoosh backends. While there are other libraries that appear to live in the same space as Haystack (integrating a search indexer with Django models, providing Django QuerySet/Model results), none of them have the robust features and community support seen in Haystack. Again, though, decoupling the search interface from the search implementation means that this piece is likely to be easy to change out if we find better options in the future (especially if we write it with this in mind).
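To make the "decouple the search interface from the search implementation" idea a bit more concrete, here is a rough sketch in plain Python. Everything in it is hypothetical and only for illustration: the names SearchCriteria, SearchBackend, InMemoryBackend and search_units are made up, none of this is existing Pulp or Haystack code, and the in-memory backend is just a stand-in for a real engine like Elasticsearch or Whoosh.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field


@dataclass
class SearchCriteria:
    """Engine-neutral search request (hypothetical shape, not a Pulp API)."""
    text: str
    filters: dict = field(default_factory=dict)
    limit: int = 10


class SearchBackend(ABC):
    """The only thing the API layer knows about a search engine."""

    @abstractmethod
    def index(self, doc_id, document):
        """Add or replace one document in the index."""

    @abstractmethod
    def search(self, criteria):
        """Return a list of matching document ids."""


class InMemoryBackend(SearchBackend):
    """Toy stand-in for a real engine; does a naive substring match."""

    def __init__(self):
        self._docs = {}

    def index(self, doc_id, document):
        self._docs[doc_id] = dict(document)

    def search(self, criteria):
        needle = criteria.text.lower()
        hits = []
        # Sort by id so results come back in a predictable order.
        for doc_id, doc in sorted(self._docs.items()):
            matches_text = any(needle in str(v).lower() for v in doc.values())
            matches_filters = all(
                doc.get(key) == value for key, value in criteria.filters.items()
            )
            if matches_text and matches_filters:
                hits.append(doc_id)
        return hits[:criteria.limit]


def search_units(backend, text, **filters):
    """API-layer entry point: depends only on the SearchBackend interface."""
    return backend.search(SearchCriteria(text=text, filters=filters))


if __name__ == "__main__":
    backend = InMemoryBackend()
    backend.index("rpm-1", {"name": "bash", "type": "rpm"})
    backend.index("rpm-2", {"name": "bash-completion", "type": "rpm"})
    backend.index("puppet-1", {"name": "bashrc module", "type": "puppet"})
    print(search_units(backend, "bash", type="rpm"))
```

The point of the sketch is that search_units never mentions a specific engine, so swapping one backend for another would mean writing another SearchBackend subclass rather than touching the API views. In practice Haystack would likely play much of this role for us, so treat this as a picture of the shape, not a proposal for the code.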
Summary:

- Django ORM on PostgreSQL
- django-rest-framework to build API views
- django-haystack to provide search capabilities, using Elasticsearch to start, possibly switching to Whoosh after some development; this switch should occur before any release of 3.0

[0]: http://docs.celeryproject.org/en/latest/django/
[1]: https://www.djangopackages.com/grids/g/rest/
[2]: http://django-haystack.readthedocs.io/en/stable/backend_support.html
[3]: https://www.djangopackages.com/grids/g/search/
_______________________________________________
Pulp-list mailing list
https://www.redhat.com/mailman/listinfo/pulp-list
