Hi Mark,
> On 25 Mar 2017, at 19:54, Mark E. Haase <meha...@gmail.com> wrote:
>
> Hi Pavel,
>
> This is a really impressive body of work. I had looked at this project in the
> past but it is great to get back up to speed and see all the progress made.
>
> I use Python + databases almost every day, and the major unanswered question
> is what benefit does dedicated language syntax have over using a DBAL/ORM
> with a Builder style API? It obviously has huge costs (as all syntax changes
> do) but the benefit is not obvious to me: I have never found myself wanting
> built-in syntax for writing database queries.

We do a lot of work with data every day doing data science. We have to use
tools like pandas, and they don’t work in a lot of cases, so we often end up
with very cryptic notebooks that only the authors can work with. I actually
use PythonQL in my daily work and it has simplified a lot of things greatly.
The general insight is that a small language can add a lot more value than a
huge library, because it’s easy to combine good ideas in a language. If you
look at some examples of complex PythonQL, I think you might change your mind:

https://github.com/pythonql/pythonql/wiki/Event-Log-Querying-and-Process-Mining-with-PythonQL

> My second thought is that every database layer I've ever used was unavoidably
> leaky or incomplete. Database functionality (even if we constrain "database"
> to mean RDBMS) is too diverse to be completely abstracted away. This is why
> so many different abstractions already exist, e.g. low-level like DBAPI and
> high-level like SQL Alchemy. You're not going to find much support for
> cementing an imperfect abstraction right into the Python grammar. In order to
> make the abstraction relatively complete, you'd need to almost complete merge
> ANSI SQL grammar into Python grammar, which sounds terrifying.

I don’t see a problem here, except for a performance problem. I.e. you’ll be
able to write queries of any complexity in PythonQL, and most of the work will
be pushed into the underlying database. Stuff that can’t be pushed will be
finished up at the Python layer. We don’t have to guarantee the other
direction, i.e. if a DBMS has transitive closure, for instance, we don’t have
to support it in PythonQL.

> Third thought: is the implementation of a "Python query language" as generic
> as the name implies? The docs mention support for document databases, but I
> can run Redis queries? LDAP queries? DNS queries?

We definitely can support Redis. As for LDAP and DNS, I don’t know if we want
to go there; I would stop at databases for now.

> > We haven't build a real SQL Database wrapper yet, but in the meanwhile you
> > can use libraries like psycopg2 or SQLAlchemy to get data from the database
> > into an iterator, and then PythonQL can run on top of such iterator.
>
> Fourth thought: until PythonQL can abstract over a real database, it's far
> too early to consider putting it into the language itself. These kinds of
> "big change" projects typically need to stabilize on their own for a long
> time before anybody will even consider putting them into the core language.

We’re definitely just at the start of this: we have huge plans for PythonQL,
including a powerful planner/optimizer and wrappers for the most popular
DBMSs. If we get the support of the Python community, though, it would surely
help us move faster.

> Finally – to end on a positive note – the coolest part of this project from
> my point of view is using SQL as an abstraction over in-memory objects or raw
> files. I can see how somebody that is comfortable with SQL would prefer this
> declarative approach. I could see myself using an API like this to search a
> Pandas dataframe, for example.

I think if we get this right, we might unlock some cool new usages.
I really believe that if we simplify the integration of multiple data sources
sufficiently, a lot of the dirty work of data scientists will become much
simpler.

> Cheers,
> Mark
>
> On Fri, Mar 24, 2017 at 11:10 AM, Pavel Velikhov <pavel.velik...@gmail.com> wrote:
> Hi folks!
>
> We started a project to extend Python with a full-blown query language
> about a year ago. The project is called PythonQL, the links are given below in
> the references section. We have implemented what is kind of an alpha version
> now, and gained some experience and insights about why and where this is
> really useful. So I’d like to share those with you and gather some opinions
> whether you think we should try to include these extensions in the Python
> core.
>
> Intro
>
> What we have done is (mostly) extended Python’s comprehensions with group
> by, order by, let and window clauses, which can come in any order, thus
> comprehensions become a query language a bit cleaner and more powerful than
> SQL. And we added a couple small convenience extensions, like a We have
> identified three top motivations for folks to use these extensions:
>
> Our Motivations
>
> 1. This can become a standard for running queries against database systems.
> Instead of learning a large number of different SQL dialects (the pain point
> here is the libraries of functions and operators that are different for each
> vendor), the Python developer needs only to learn PythonQL and he can query
> any SQL and NoSQL database.
>
> 2. A single PythonQL expression can integrate a number of
> databases/files/memory structures seamlessly, with the PythonQL optimizer
> figuring out which pieces of plans to ship to which databases. This is a cool
> virtual database integration story that can be very convenient, especially
> now, when a lot of data scientists use Python to wrangle the data all day
> long.
>
> 3. Querying data structures inside Python with the full power of SQL (and a
> bit more) is also really convenient on its own. Usually folks that are
> well-versed in SQL have to resort to completely different means when they
> need to run a query in Python on top of some data structures.
>
> Current Status
>
> We have PythonQL running, it’s installed via pip and an encoding hack that
> runs our preprocessor. We currently compile PythonQL into Python using our
> executor functions and execute Python subexpressions via eval. We don’t do
> any optimization / rewriting of queries into languages of underlying systems.
> And the query processor is basic too, with naive implementations of
> operators. But we’ve built DBMS systems before, so if there is a good amount
> of support for this project, we’ll be able to build a real system here.
>
> Your take on this
>
> Extending Python’s grammar is surely a painful thing for the community. We’re
> now convinced that it is well worth it, because of all the wonderful
> functionality and convenience this extension offers. We’d like to get your
> feedback on this and maybe you’ll suggest some next steps for us.
>
> References
>
> PythonQL GitHub page: https://github.com/pythonql/pythonql
> PythonQL Intro and Tutorial (this is all User Documentation we have right
> now): https://github.com/pythonql/pythonql/wiki/PythonQL-Intro-and-Tutorial
> A use-case of querying Event Logs and doing Process Mining with PythonQL:
> https://github.com/pythonql/pythonql/wiki/Event-Log-Querying-and-Process-Mining-with-PythonQL
> PythonQL demo site: www.pythonql.org
>
> Best regards,
> PythonQL Team
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas@python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
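
For illustration, the "get data from the database into an iterator, then query
on top of it" idea quoted above can be sketched in plain Python. This is only
a sketch under stated assumptions: it is not PythonQL syntax and not how
PythonQL is implemented. The standard-library sqlite3 module stands in for
psycopg2 or SQLAlchemy, and the events table, its columns and its rows are
made up. It shows the group-by / order-by plumbing one writes by hand today
over a cursor.

```python
import sqlite3
from itertools import groupby
from operator import itemgetter

# Hypothetical in-memory table standing in for a real database (assumption:
# sqlite3 is used only because it ships with Python).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (username TEXT, step TEXT, seconds REAL)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [
        ("ann", "login", 1.5),
        ("bob", "login", 2.0),
        ("ann", "query", 4.0),
        ("bob", "query", 3.0),
        ("ann", "query", 2.0),
    ],
)

# A DB-API cursor is already an iterator over row tuples.
rows = conn.execute("SELECT username, step, seconds FROM events")

# "Group by username": groupby() only merges adjacent items,
# so the rows have to be sorted on the grouping key first.
by_user = sorted(rows, key=itemgetter(0))
totals = [
    (username, sum(row[2] for row in group))
    for username, group in groupby(by_user, key=itemgetter(0))
]

# "Order by total seconds, descending".
totals.sort(key=itemgetter(1), reverse=True)
print(totals)
```

The sort-then-groupby-then-sort sequence is the kind of boilerplate that a
group by / order by clause inside a comprehension is meant to absorb.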