Thanks for your input. I'm probably showing my technological age here, but I admit that, as a matter of principle, I tend to avoid repeating a complex operation when it's known in advance that the second run will produce exactly the same result as the first. When I catch myself repeating one, I always feel that my design is not OK.
However, in this case I am quite sure I need to get rid of the double parsing, although I have not rigorously demonstrated that it is the cause of the slowdown. It's more of an educated guess, reinforced by the fact that the method Expression.fromString(String) has a comment saying "TODO: cache expression strings, since this operation is pretty slow" (I'm using version 3.0.2). So it looks like the Cayenne developers also had some reason to worry about optimization in this area.

I just used JVisualVM to profile the execution, and two of the methods where by far most of the time is spent are Expression.fromString(String) and ExpressionParser.getNextToken(). Since I have to cut down the processing time, I have to focus on them first.

The situation is that I modified a preexisting application which did some basic parsing and then used the resulting tokens to match the expression against objects. That parsing was basic in that it could only handle simple expressions; e.g. it didn't support grouping with parentheses. My changes consisted of removing that parsing code from the application and replacing it with calls to Cayenne, because we need real parsing. Of course the parsing done by Cayenne is far more powerful, and that might be the real and fair reason why it takes longer, but even if that is the case it's important for me not to do that parsing twice.

It's not easy to explain properly why I need the tokens; the general reason is that the preexisting application, written long ago by several other people, is designed to use them, and changing its design would be too big an undertaking. Since all that needs to be improved is the parsing and matching, I thought I'd just use a powerful tool to replace only those parts.

I will see if I can use Andrus' pointers to extract the tokens from the Expression instance.
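
To make the caching idea from that TODO concrete, here is a minimal sketch of memoizing a parse result by its source string, so the same string is only parsed once. The class ParseCache and its methods are hypothetical names, not part of Cayenne; the parser is passed in as a function so the sketch runs without any library on the classpath. With Cayenne it would presumably be constructed as new ParseCache<Expression>(Expression::fromString), which is an assumption and has not been tried against 3.0.2.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/**
 * Hypothetical helper (not a Cayenne class): remembers the result of an
 * expensive parse, keyed by the exact source string.
 */
final class ParseCache<T> {

    private final Map<String, T> cache = new ConcurrentHashMap<>();
    private final Function<String, T> parser;

    ParseCache(Function<String, T> parser) {
        this.parser = parser;
    }

    /** Returns the cached result, invoking the parser only on the first request. */
    T get(String source) {
        return cache.computeIfAbsent(source, parser);
    }

    /** Number of distinct strings parsed so far. */
    int size() {
        return cache.size();
    }

    public static void main(String[] args) {
        // Stand-in parser for demonstration only; a real one would build an expression tree.
        ParseCache<String[]> demo = new ParseCache<>(s -> s.split("="));

        String[] first = demo.get("myField=0");
        String[] second = demo.get("myField=0"); // served from the cache, no second parse

        System.out.println("same instance: " + (first == second));
        System.out.println("distinct strings parsed: " + demo.size());
    }
}
```

This only removes repeated parsing of identical strings; it does not by itself yield the tokens. Cached objects are shared between callers, so if the parsed type is mutable it should be treated as read-only.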
-----Original Message-----
From: Andrus Adamchik [mailto:[email protected]]
Sent: Sunday, November 16, 2014 14:57
To: [email protected]
Subject: Re: Extracting tokens from an expression and matching an object against that expression without parsing twice

I second John's assessment. BTW, what are the tokens for? Do you actually need to have access to the lexical structure of the String? As of course parsed Expression object is a tree itself and gives you access to its own structure either directly ('getOperand(int)') or via 'traverse' and 'transform' methods.

Andrus

> On Nov 14, 2014, at 9:54 PM, John Huss <[email protected]> wrote:
>
> This looks like a serious micro optimization. Is the performance for
> this really that critical? Have you demonstrated that this is your
> application's crucial hot spot?
>
> On Fri, Nov 14, 2014 at 7:35 AM, Davide Vecchi <[email protected]> wrote:
>
>> Hi all,
>>
>> I have an expression in a string, and I use Cayenne to parse the
>> expression into tokens, which are needed for a specific purpose.
>>
>> However in addition to having the tokens I also need to evaluate an
>> object against that expression, to see if that object matches the expression.
>>
>> My problem is that the way I'm doing it causes the parsing to be done
>> twice on the same expression, and I would like to avoid to parse the
>> same expression twice.
>>
>> The token creation I'm doing it like this:
>>
>> -----------------------------------
>> String where = "myField=0";
>>
>> Reader reader = new StringReader(where);
>>
>> ExpressionParser parser = new ExpressionParser(reader);
>>
>> List<Token> tokens = new ArrayList<>();
>>
>> Token token = parser.getNextToken();
>>
>> while (token != null) {
>>
>>     tokens.add(token);
>>
>>     token = parser.getNextToken();
>> }
>> -----------------------------------
>>
>> The object matching I'm doing it like this:
>>
>> -----------------------------------
>> String where = "myField=0";
>>
>> Expression expression = Expression.fromString(where);
>>
>> boolean matches = expression.match(object);
>> -----------------------------------
>>
>> The call to Expression.fromString made in the object matching
>> operation performs a parsing, but the parsing of the same expression
>> had already been done in the token creation operation.
>>
>> Is there a way to redesign this process in order to get the tokens
>> and also match an object against the expression without parsing the
>> same expression twice ?
>>
>> For example, I believe that the call to Expression.fromString must
>> have created the tokens, because it has parsed the string. So I
>> thought I could reverse the order and do the object matching first,
>> keep the Expression instance created in that process and use it to
>> extract the tokens. But I can't see how to extract the tokens from an
>> Expression instance instead of from an ExpressionParser instance as I'm
>> currently doing.
>>
>> Or another possibility could be that I keep creating the tokens
>> first, and then I match my object against them, instead of against
>> the string expression that generated those tokens. But I can't see
>> how to match an object against tokens.
>>
>> So I'm looking for some ideas.
>>
>> Thanks in advance.
>>
>> Davide Vecchi
>>
