On Monday, February 16, 2015 at 10:54:10 AM UTC+1, Luigi Dell'Aquila wrote: > > Hi, > > OrientDB 2.0 was released just a few week ago, but we are already working > hard to the next release. > > You know that two of the oldest pieces of code inside OrientDB core are > the SQL parser and the query executor, and you know that both have some > flaws. > Some of you struggled to make queries work with additional blank > spaces(see https://github.com/orientechnologies/orientdb/issues/3559 and > https://github.com/orientechnologies/orientdb/issues/3519 ) > or to pass named/unnamed parameters to functions and inner queries (see > https://github.com/orientechnologies/orientdb/issues/1069). > > After releasing 2.0 we started a new development path to have: > - strict query language definition > - strict query validation with better syntax error messages > - full support to prepared statements and parameters (also where now it > fails) > - new command/query executor structure (stateless, less memory consuming) > - better query execution plans and index management > - some additional operators (suspense here ;-) you have to wait for 3.0) > > This path will end with 3.0, but in 2.1 we are releasing the first step, > that is the new query parser. > > The new parser is based on well known JavaCC. It processes all the SQL > statements and produces a tree structure that will be the future query > executor. So the final query execution structure will be the following: > > 1. Get the SQL statement > 2. look for a cached executor > 3. if it exists goto 7 > 4. parse the statement > 5. optimize the statement based on schema and indexes > 6. put it in the statement cache > 7. run the (new) executor with input parameters > 8. send the (streamed) result to the client > > At this step, the SQL parser is ready, but the query executor is too > complex to be refactored in time for 2.1, so we decided to go with a hybrid > solution, to let you have a taste of the new features. > Here's how the query execution changes in 2.1: > > 1. Get the SQL statement > 2. pre-parse the statement with the new SQL parser <- this gives you > strict SQL syntax and much better syntax errors > 3. replace named/unnamed parameters <- this solves problems in parameter > passing to subqueries, functions etc. > 4. rewrite the query based on the parsed structure <- this solves problems > with white spaces and so on... I know it's dirty stuff, but the old > executor is too coupled with the old parser to be completely split in such > a short time > 5. use the old parser to parse the rewritten query and generate the old > executor <- a very small fraction of old parsing problems will remain for > now, sorry... > 6. run the (old) executor > > > ** AN EYE TO THE PAST ** > > Advantages in the short term are obvious, but you can also experience > small problems in the migration, eg. > > - old parser lets you write things like > SELECT FROM FOO ORDER BY A ASC ORDER BY A DESC GROUP BY A ORDER BY A DESC > or even > SELECT FROM FOO ORDER BY A ASC, , , , A ASC > with the new parser you will not be allowed to do this, now every clause > has its own position in the statement (see docs) > > - old parser supports some old (deprecated) operators like traverse() > function (different from TRAVERSE statement) > eg. SELECT FROM Profile WHERE any() traverse(0,3) (city = 'Rome') > This syntax will not be supported anymore, so with the new parser you will > get a syntax error > > - in old parser IN and CONTAINS operators validation is somehow lazy, eg. > you can write > SELECT FROM FOO WHERE a IN 'bar' > but it's supposed to be incorrect, because 'bar' is a string, not a > collection, so the new parser will throw an exception > > > ** AND HOW ABOUT BACKWARD COMPATIBILITY...? ** > > To allow a smooth transition, we decided to apply the following policy to > 2.1: > - old databases will continue to work with the old parser > - databases created with 2.1 will have the new parser enabled by default > - for backward compatibility, you will be able to enable/disable the new > parser (for now) setting the "strictSql" parameter to the db (ALTER > DATABASE docs will be updated shortly) > > In next releases we will probably drop a lot of old code, so I suggest you > to get used to the new strict SQL validation ;-) > > ** HOW CAN I TRY IT NOW? ** > > All this is now in a GIT branch named "javacc_parser", but today we will > merge it in "develop" branch, so you will be able to try it in next > 2.1-SNAPSHOT releases. > > FOR NOW the new parser is enabled only on SELECT and TRAVERSE statements, > in next days we will also cover UPDATE, INSERT, DELETE, CREATE EDGE, CREATE > VERTEX and so on... > > If you have some time, please try it out and give us your feedback, WE > NEED YOUR HELP TO MAKE IT PERFECT!!! > > Thanks > > Hi Luigi,
in a fairly simple app, I see performance drop between 2.0.9 and 2.1-SNAPSHOT, mainly due to time spent in orient parser classes,... 5000 req/sec drops to 3400 req/sec (34% of time is spent in query parsing/tokenizing). This is a small dataset and whole request is executing only one query, but still may be worth investigating... cheers /m > Luigi > > > > -- > Luigi Dell'Aquila > Orient Technologies LTD > -- --- You received this message because you are subscribed to the Google Groups "OrientDB" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. For more options, visit https://groups.google.com/d/optout.
