We've had a few discussions about this in the past. As 5.0 is getting close to Final (next week), it's time to start contemplating our next major tasks. The consensus pick has been the idea of a "unified SQL generation engine", along with a shared project for the semantic analysis of HQL/JPQL (and recently it was decided to include JPA Criteria interpretation here as well).
The central premise is this. Take the roughly 6 or 7 different top-level ways Hibernate generates SQL and combine them into one "engine" that takes a "semantic tree" as its input. The HQL/JPQL/Criteria shared project mentioned above will be one producer of such semantic trees. Others would include persisters (for insert/update/delete requests) and loaders (for load requests).

We still have a lot of tasks remaining for this overall goal. We have to finalize the design for the HQL/JPQL/Criteria to semantic tree translator. One option is to proceed with the Antlr 4 based approach I started a PoC for. John has been helping me with that lately. The first task here is to come to a consensus on whether Antlr 4 is the way we want to proceed.

We've been over the pros and cons before in detail. In summary, there is a lot to love about Antlr 4. Our grammar for HQL recognition and semantic tree building is very simple and elegant imo. The drawback is clearly the lack of tree walking, meaning that we are responsible for writing our walker for the semantic tree by hand. In fact multiple walkers, since each consumer (orm, ogm, search) would need to write its own. And if we decide to build another AST while walking the semantic tree, we'd end up having to hand-write yet another walker for that one.

What I mean by that last part is that there are 2 ways we might choose to deal with the semantic tree. For the purpose of discussion, let's look at the ORM case. The first approach is to simply generate the SQL as we walk the semantic tree; this would be a 2-phase interpretation approach (input -> semantic tree -> SQL). That works in many cases but breaks down in others. This is exactly the approach our existing HQL translator uses. The other approach is to use a 3-phase translation (input -> semantic-tree -> semantic-SQL-tree(s) -> SQL). This gives a hint to one of the major problems.
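
To make the two shapes concrete, here is a minimal Java sketch. It is not taken from the PoC, and every name in it (PhaseSketch, SemanticSelect, SqlSelect, twoPhase, toSqlTrees, render) is hypothetical. It also assumes, purely for illustration, that the entity name doubles as the table name and that a fetched collection is loaded by its own statement.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class PhaseSketch {

    /** Hypothetical semantic-tree node: "select attributes from entity [fetch collection]". */
    static final class SemanticSelect {
        final String entity;
        final List<String> attributes;
        final String fetchedCollection; // null when nothing is fetched

        SemanticSelect(String entity, List<String> attributes, String fetchedCollection) {
            this.entity = entity;
            this.attributes = attributes;
            this.fetchedCollection = fetchedCollection;
        }
    }

    /** Hypothetical SQL-level tree node: one instance per SQL statement. */
    static final class SqlSelect {
        final String table;
        final List<String> columns;

        SqlSelect(String table, List<String> columns) {
            this.table = table;
            this.columns = columns;
        }
    }

    /** 2-phase: a hand-written walker emits SQL text while visiting the semantic tree. */
    static String twoPhase(SemanticSelect query) {
        // Only one string comes back, so there is no natural place for a second statement.
        return "select " + String.join(", ", query.attributes) + " from " + query.entity;
    }

    /** 3-phase, step 1: the semantic tree becomes one or more SQL trees. */
    static List<SqlSelect> toSqlTrees(SemanticSelect query) {
        List<SqlSelect> trees = new ArrayList<>();
        trees.add(new SqlSelect(query.entity, query.attributes));
        if (query.fetchedCollection != null) {
            // Assumption for illustration: the fetched collection gets its own statement.
            trees.add(new SqlSelect(query.fetchedCollection, Collections.singletonList("*")));
        }
        return trees;
    }

    /** 3-phase, step 2: a second hand-written walker renders each SQL tree to text. */
    static String render(SqlSelect sql) {
        return "select " + String.join(", ", sql.columns) + " from " + sql.table;
    }

    public static void main(String[] args) {
        SemanticSelect query =
                new SemanticSelect("Person", Arrays.asList("id", "name"), "Address");
        System.out.println(twoPhase(query));
        for (SqlSelect tree : toSqlTrees(query)) {
            System.out.println(render(tree));
        }
    }
}
```

Note that the 3-phase variant needs two walkers (toSqlTrees and render), which is the hand-writing cost described above; in exchange, the intermediate list of SQL trees can be inspected or rewritten before any SQL text exists.
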
One source "semantic" query will often correspond to multiple SQL queries; that is hard to manage in the 2-phase approach. Not to mention integrating things like follow-on fetches and the other enhancements we want to gain from this. My vote is definitely for 3 or more phases of interpretation. The problem is that this is exactly where Antlr 4 sort of falls down.

So first things first... we need to decide on Antlr 3 versus Antlr 4 (versus some other parser solution). Next, on the ORM side (every "backend" can decide this individually), we need to decide on the approach for semantic-tree to SQL translation, which somewhat depends on the Antlr 3 versus Antlr 4 decision.

We really need to decide these things ASAP and get moving on them as soon as ORM 5.0 is finished. Also, this is a massive undertaking with huge potential gains, and not just for ORM. As such, we need to understand who will be working on this. Sanne, Gunnar... I know y'all have a vested interest and a desire to work on it. John, I know the same is true for you. Andrea? Have you had a chance to look over the PoC and/or get more familiar with Antlr?

_______________________________________________
hibernate-dev mailing list
hibernate-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/hibernate-dev