First off, I think PythonQL (and PonyORM before it) is a very interesting piece of technology. However, I think some of the answers so far suggest we may need to discuss a couple of meta-issues around target audiences and available technical options before continuing on.
I'm quoting Gerald's post here because it highlights the "target audience" problem, but my comments apply to the thread generally.

On 25 March 2017 at 22:51, Gerald Britton <gerald.brit...@gmail.com> wrote:
>
> I see lots of C# code, but (thankfully) not so much LINQ to SQL. Yes, it is
> a cool technology. But I sometimes have a problem with the SQL it generates.
> Since I'm also a SQL developer, I'm sensitive to how queries are
> constructed, for performance reasons, as well as how they look, for
> readability and aesthetic reasons.
>
> LINQ queries can generate poorly-performing SQL, since LINQ is a basically a
> translator, but not an AI. As far as appearances go, LINQ queries can look
> pretty gnarly, especially if they include sub queries or a few joins. That
> makes it hard for the SQL dev (me!) to read and understand if there are
> performance problems (which there often are, in my experience)
>
> So, I would tend to code the SQL separately and put it in a SQL view,
> function or stored procedure. I can still parse the results with LINQ (not
> LINQ to SQL), which is fine.
>
> For similar reasons, I'm not a huge fan of ORMs either. Probably my bias
> towards designing the database first and building up queries to meet the
> business goals before writing a line of Python, C#, or the language de jour.

Right, the target audience here *isn't* folks who already know how to construct their own relational queries in SQL, and it definitely isn't folks that know how to tweak their queries to get optimal performance from the specific database they're using. Rather, it's folks that already know Python's comprehensions, and perhaps some of the itertools features, and the aim is to provide them with a smoother on-ramp into the world of relational data processing.

There's no question that folks dealing with sufficiently large data sets with sufficiently stringent performance requirements are eventually going to want to reach for handcrafted SQL or a distributed computation framework like dask. But that's not really any different from our standard position that when folks are attempting to optimise a hot loop, they're eventually going to have to switch to something that can eliminate the interpreter's default runtime object management overhead (whether that's Cython, PyPy's or Numba's JIT, or writing an extension module in a different language entirely). It isn't an argument against making it easier for folks to postpone the point where they find it necessary to reach for the "something else" that takes them beyond Python's default capabilities.

However, at the same time, PythonQL *is* a DSL for data manipulation operations, and map and filter are far and away the most common of those. Even reduce, which was previously a builtin, was pushed into functools for Python 3.0, with the preferred alternative being to just write a suitably named function that accepts an iterable and returns a single value. And while Python is a very popular tool for data manipulation, it would be a big stretch to assume that that was its primary use case in all contexts.

So it makes sense to review some of the technical options that are available to help make projects like PythonQL more maintainable, without necessarily gating improvements to them on the relatively slow update and rollout cycle of new Python versions.

= Option 1 =

Fully commit to the model of allowing alternate syntactic dialects to run atop Python interpreters.
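As a concrete point of reference for what that model already looks like in practice, the sketch below shows one way a source-translating codec can be registered today (this is the kind of text encoding hook discussed in the next paragraph). The pyql_transform(), pyql_decode() and pyql_search() names are hypothetical placeholders rather than PythonQL's actual implementation, and a production dialect would also need to supply transforming incremental decoders and stream readers so that directly executed scripts are handled as well as imported modules:

    import codecs
    import encodings

    def pyql_transform(source):
        # Hypothetical placeholder: a real dialect would rewrite its
        # extended query syntax into ordinary Python source here.
        return source

    def pyql_decode(raw_bytes, errors="strict"):
        # Decode the module bytes as UTF-8, then run the source rewriter.
        text, consumed = codecs.utf_8_decode(raw_bytes, errors, True)
        return pyql_transform(text), consumed

    def pyql_search(name):
        # Make "# coding: pyql" resolve to this transforming codec,
        # delegating everything except decoding to the utf-8 codec.
        if name != "pyql":
            return None
        utf8 = encodings.search_function("utf-8")
        return codecs.CodecInfo(
            name="pyql",
            encode=utf8.encode,
            decode=pyql_decode,
            incrementalencoder=utf8.incrementalencoder,
            incrementaldecoder=utf8.incrementaldecoder,
            streamreader=utf8.streamreader,
            streamwriter=utf8.streamwriter,
        )

    codecs.register(pyql_search)

With something along those lines registered, a module that begins with a "# coding: pyql" declaration has its source run through the rewriter before compilation. That is roughly what the "possible, but a bit hacky" status quo amounts to: the compiled code corresponds to the rewritten source, while tooling such as tracebacks and debuggers has no standard way to map back to the original dialect text.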
In Hylang and PythonQL we have at least two genuinely interesting examples of that model working through the text encoding system, as well as other examples like Cython that work through the extension module system. So that's an opportunity to take this from "Possible, but a bit hacky" to "Pluggable source code translation is supported at all levels of the interpreter, including debugger source maps, etc." (perhaps by borrowing ideas from other ecosystems like Java, JavaScript, and .NET, where this kind of thing is already a lot more common).

The downside of this approach is that actually making it happen would be getting pretty far afield from the original PythonQL goal of "provide nicer data manipulation abstractions in Python", and it wouldn't actually deliver anything new that can't already be done with existing import and codec system features.

= Option 2 =

Back when f-strings were added for 3.6, I wrote PEP 501 to generalise the idea as "i-strings": exposing the intermediate interpolated form of f-strings, such that you could write code like `myquery = sql(i"SELECT {column} FROM {table};")`, where the "sql" function received an "InterpolationTemplate" object that it could render however it wanted, but the "column" and "table" references were just regular Python expressions. It's currently deferred indefinitely, as I didn't have any concrete use cases that Guido found sufficiently compelling to make the additional complexity worthwhile.

However, given optionally delayed rendering of interpolated strings, PythonQL could be used in the form:

    result = pyql(i"""
        (x,y)
        for x in {range(1,8)}
        for y in {range(1,7)}
        if x % 2 == 0 and y % 2 != 0 and x > y
    """)

I personally like this idea (otherwise I wouldn't have written PEP 501 in the first place), and the necessary technical underpinnings to enable it are largely already in place to support f-strings. If the PEP were revised to show examples of using it to support relatively seamless calling back and forth between Hylang, PythonQL and regular Python code in the same process, that might be intriguing enough to pique Guido's interest (and I'm open to adding co-authors that are interested in pursuing that).

= Option 3 =

Go all the way to expanding comprehensions to natively be a full data manipulation DSL.

I'm personally not a fan of that approach, as syntax is really hard to search for help on (keywords are better for that than punctuation, but not by much), while methods and functions get to have docstrings. It also means the query language gets tightly coupled to the Python grammar, which not only makes the query language difficult to update, but also makes Python's base syntax harder for new users to learn.

By contrast, when DSLs are handled as interpolation templates with delayed rendering, the rendering function gets to provide runtime documentation, and the definition of the DSL is coupled to the update cycle of the rendering function, *not* that of the Python language definition.

Cheers,
Nick.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia