Hi folks!

  We started a project to extend Python with a full-blown query language about 
a year ago. The project is called PythonQL; links are given below in the 
References section. We now have what is roughly an alpha version, and we have 
gained some experience and insight into why and where this is really useful. 
I’d like to share those with you and gather opinions on whether you think we 
should try to include these extensions in the Python core.

Intro

  What we have done is (mostly) extend Python’s comprehensions with group by, 
order by, let and window clauses, which can come in any order. Comprehensions 
thus become a query language that is a bit cleaner and more powerful than SQL. 
We also added a couple of small convenience extensions. We have identified 
three top motivations for folks to use these extensions:

Our Motivations

1. This can become a standard for running queries against database systems. 
Instead of learning a large number of different SQL dialects (the pain point 
here is the libraries of functions and operators that differ for each vendor), 
a Python developer needs to learn only PythonQL and can then query any SQL or 
NoSQL database.

2. A single PythonQL expression can integrate a number of 
databases/files/memory structures seamlessly, with the PythonQL optimizer 
figuring out which pieces of plans to ship to which databases. This is a cool 
virtual database integration story that can be very convenient, especially now, 
when a lot of data scientists use Python to wrangle the data all day long.

3. Querying data structures inside Python with the full power of SQL (and a bit 
more) is also really convenient on its own. Usually folks that are well-versed 
in SQL have to resort to completely different means when they need to run a 
query in Python on top of some data structures.
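
To make point 3 concrete, here is a minimal sketch of a "group by" plus 
"order by" query over an in-memory list, written in today's plain Python. It 
is not PythonQL syntax (see the tutorial in the References for that); the 
sample data and the name totals_by_customer are invented for illustration.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical sample data, invented for this illustration.
orders = [
    {"customer": "alice", "amount": 30},
    {"customer": "bob", "amount": 15},
    {"customer": "alice", "amount": 20},
    {"customer": "bob", "amount": 5},
    {"customer": "carol", "amount": 40},
]


def totals_by_customer(rows):
    """Total amount per customer, largest total first."""
    key = itemgetter("customer")
    # groupby only merges adjacent rows, so the input must be sorted
    # on the grouping key first.
    grouped = groupby(sorted(rows, key=key), key=key)
    totals = [
        (customer, sum(row["amount"] for row in group))
        for customer, group in grouped
    ]
    # The "order by total descending" step is a second, separate sort.
    return sorted(totals, key=itemgetter(1), reverse=True)


result = totals_by_customer(orders)
print(result)
```

The grouping, aggregation and ordering are each expressed through a different 
mechanism (sort, groupby, comprehension, sort again), which is the kind of 
"completely different means" an SQL user has to pick up.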

Current Status

We have PythonQL running. It is installed via pip and relies on an encoding 
hack that runs our preprocessor. We currently compile PythonQL into Python 
code that calls our executor functions, and we execute Python subexpressions 
via eval. We don’t yet do any optimization or rewriting of queries into the 
languages of the underlying systems. The query processor is basic too, with 
naive implementations of the operators. But we’ve built database systems 
before, so if there is a good amount of support for this project, we’ll be 
able to build a real system here.
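
To illustrate the general idea of evaluating subexpressions via eval with 
naive operators, here is a minimal sketch. It is not PythonQL's actual 
executor: the function run_query, its parameters, and the clause 
representation are all hypothetical, chosen only to show the technique.

```python
def run_query(select_expr, for_clauses, where_expr=None, env=None):
    """Naive nested-loop evaluation of a comprehension-like query.

    Hypothetical interface, not PythonQL's API:
      select_expr  -- Python source string for the result expression
      for_clauses  -- list of (variable_name, source_expression_string)
      where_expr   -- optional Python source string for the filter
      env          -- names visible to all expressions
    """
    env = dict(env or {})
    results = []

    def evaluate(expr, bindings):
        # A fresh dict is passed as globals so that eval's automatic
        # insertion of __builtins__ never leaks into env or bindings.
        return eval(expr, {**env, **bindings})

    def loop(depth, bindings):
        if depth == len(for_clauses):
            if where_expr is None or evaluate(where_expr, bindings):
                results.append(evaluate(select_expr, bindings))
            return
        var, source_expr = for_clauses[depth]
        # Later clauses may refer to variables bound by earlier ones.
        for value in evaluate(source_expr, bindings):
            loop(depth + 1, {**bindings, var: value})

    loop(0, {})
    return results


pairs = run_query(
    "(x, y)",
    [("x", "range(3)"), ("y", "range(x)")],
    where_expr="(x + y) % 2 == 1",
)
print(pairs)
```

Note that eval executes arbitrary code, so a scheme like this is only 
suitable for expressions that come from trusted source files.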

Your take on this

Extending Python’s grammar is surely a painful thing for the community. We’re 
now convinced that it is well worth it, because of all the wonderful 
functionality and convenience this extension offers. We’d like to get your 
feedback on this and maybe you’ll suggest some next steps for us.

References

PythonQL GitHub page: 
https://github.com/pythonql/pythonql
PythonQL Intro and Tutorial (this is all the user documentation we have right 
now): 
https://github.com/pythonql/pythonql/wiki/PythonQL-Intro-and-Tutorial
A use case of querying event logs and doing process mining with PythonQL: 
https://github.com/pythonql/pythonql/wiki/Event-Log-Querying-and-Process-Mining-with-PythonQL
PythonQL demo site: 
http://www.pythonql.org/

Best regards,
PythonQL Team



