As I mentioned, I already have something ...
I'll commit it immediately ... 
I took the ANTLR3 spec I once wrote myself (I know it's not perfect, but it 
should be more than enough for our case).
It turned out that it was actually pretty easy to convert it to ANTLR4.
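Chris's actual grammar isn't attached to this thread, but as a rough illustration, a minimal ANTLR4 expression grammar covering the kinds of expressions quoted further down (field references, arithmetic, comparisons, and simple no-argument calls like items.size()) could look like this. All rule and token names here are hypothetical, not taken from the real spec:

```antlr
grammar Expression;

// ANTLR4 resolves left recursion; alternatives listed in precedence order.
expression
    : expression ('*' | '/') expression
    | expression ('+' | '-') expression
    | expression ('==' | '!=') expression
    | '(' expression ')'
    | reference
    | INT
    ;

// Dotted field access, optionally ending in a no-arg call: items.size()
reference
    : IDENTIFIER ('.' IDENTIFIER)* ('(' ')')?
    ;

IDENTIFIER : [a-zA-Z_] [a-zA-Z0-9_]* ;
INT        : [0-9]+ ;
WS         : [ \t\r\n]+ -> skip ;
```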

Chris

On 10.06.19 at 17:27, "Julian Feinauer" <[email protected]> wrote:

    Hi Chris,
    
    Thanks for the clarification and the 'minimal spec'.
    I'll try to hack something together for that.
    
    Where does the 'value' in your example come from?
    
    J
    
    Sent from my mobile phone
    
    
    -------- Original message --------
    Subject: Re: [CodeGen] Performance values
    From: Christofer Dutz
    To: [email protected]
    Cc:
    
    Actually,
    
    the operations you are concerned about are the only ones we don't need ;-)
    The parsing and serialization have some expressions in the field 
definitions. Currently they are:
    payload.lengthInBytes + 4
    lengthInBytes - (payload.lengthInBytes + 1)
    (headerLength + 1) - curPos
    Rest
    parameter.lengthInBytes
    payload.lengthInBytes
    items.size()
    numItems
    address.lengthInBytes
    lengthInBytes == 8
    parameter.numItems
    
    so you see it's not that complicated ... and I actually don't want to run 
an expression evaluator; I want a parser so I can use the expression 
language AST to transform these expressions into something that can be run 
natively:
    
    lengthInBytes - (payload.lengthInBytes + 1)
    
    would be translated into the following Java code:
    
     value.getLengthInBytes() - (value.getPayload().getLengthInBytes() + 1)
    
    JEXL is already quite capable and fast, but interpreting this stuff 
through a runtime still costs a lot more than native code.
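As a minimal sketch of the translation Chris describes (hypothetical class and method names, not the actual PLC4X generator code): the regex approach below rewrites each dotted field path into a getter chain on a `value` object. It deliberately ignores method calls like items.size() and operator keywords; the AST-based translator Chris proposes would handle those properly.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ExpressionTranslator {

    // Matches dotted identifier chains such as "payload.lengthInBytes".
    private static final Pattern PATH =
        Pattern.compile("[a-zA-Z_][a-zA-Z0-9_]*(\\.[a-zA-Z_][a-zA-Z0-9_]*)*");

    /** Rewrites each field path in the expression into a Java getter chain on "value". */
    public static String translate(String expression) {
        Matcher m = PATH.matcher(expression);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            StringBuilder chain = new StringBuilder("value");
            for (String part : m.group().split("\\.")) {
                chain.append(".get")
                     .append(Character.toUpperCase(part.charAt(0)))
                     .append(part.substring(1))
                     .append("()");
            }
            m.appendReplacement(sb, Matcher.quoteReplacement(chain.toString()));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(translate("lengthInBytes - (payload.lengthInBytes + 1)"));
        // value.getLengthInBytes() - (value.getPayload().getLengthInBytes() + 1)
    }
}
```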
    
    Chris
    
    
    
    
    On 10.06.19 at 13:52, "Julian Feinauer" <[email protected]> wrote:
    
        Hi Lukasz,
    
        good point.
        In fact, one reason why I initially favored defining these types in 
some kind of programming language (Python?!) was to avoid creating the 
(N+1)-th expression language.
        And those two were only examples.
        In fact you are right that we need to implement "special" operators 
(">>", "<<") for our scenarios.
    
        But nonetheless, it's a pain to bother users with yet another language that 
does everything a bit cleverer than everyone else before (tm), but I see no way 
around it.
        Or has anyone else an idea?
    
        One idea could be to use a "known" language like JavaScript or 
Python... but that's also not perfect, I think.
    
        Julian
    
        PS.: There are nice ways to build an IntelliJ language plugin as well as 
VS Code plugins from ANTLR grammars; this is something we should definitely 
address once we are settled.
    
        On 10.06.19 at 13:44, "Łukasz Dywicki" <[email protected]> wrote:
    
            Keep in mind that Camel's "simple language" is the N-th incarnation of 
the same thing. As far as I remember, the initial version of it was a 
mixture (or rather torture) of the reflection API combined with a bunch of 
if-then-else statements. It was refactored into a proper expression 
language between versions.
    
            In the case of the drivers we have lots of bit operations which cannot
            even be expressed in SpEL or Camel. We are lucky that the syntax (or
            rules) for these is already known and used elsewhere.
            Both Spring and Camel work at the "object" level, thus we might reuse 
their
            experience at the higher level, where the actual frame is constructed 
and we
            have linked parts, such as your examples with sizes. A difficult part 
will
            come with nested structures, since neither Camel nor SpEL works with
            these, yet frames can be enveloped multiple times. We need to 
retain an
            open way to let data and expressions pass through the transport/network
            (TPDU/NPDU), protocol and application layers (APDU).
    
            If we leave an API/SPI for expression evaluation, then anyone can 
pick a
            defined expression, transform it into their own language, and let it be
            evaluated at runtime.
    
            Cheers,
            Łukasz
            --
            ConnectorIO http://connectorio.com
    
            On 10.06.2019 11:43, Julian Feinauer wrote:
            > Hi Chris,
            >
            > I would really, really, really (...) love to jump in... but I'm a 
bit stressed out at the moment.
            > My approach would be to have a well-specified language (similar 
to Spring EL or Camel's Simple Language) which is "well documented" and parsed 
via ANTLR (4!) into one of my ASTs, because I can then generate "bodies" in 
arbitrary languages.
            > This should take about one day, I think, so perhaps I can 
contribute something soon.
            >
            > Perhaps it would be good to specify what we want to support and 
what the "default" methods are... like "getSize()" or so.
            >
            > Julian
            >
            > On 10.06.19 at 11:38, "Christofer Dutz" 
<[email protected]> wrote:
            >
            >     Hi all,
            >
            >     While thinking of it, I remembered that about 10 years ago I 
once created exactly such a grammar, but as an ANTLR3 version.
            >     So what I'll do is create a new grammar which is aimed only 
at the expression language, because I think this could be something useful in general.
            >     Perhaps you all could give some feedback as soon as I've got 
something.
            >
            >     Chris
            >
            >     On 10.06.19 at 10:40, "Christofer Dutz" 
<[email protected]> wrote:
            >
            >         Hi all,
            >
            >         so on my trip home a few days ago I managed to also get 
the serialization running. I can now parse a byte message
            >         into a model and serialize the model back to bytes, and 
the byte arrays are equal. However, I am not that happy with
            >         the serialization performance, as it takes quite a lot longer 
to serialize than to parse, which shouldn't be the case.
            >
            >         The main reason is that while the implicit fields are 
simply read during parsing, when serializing them a lot of
            >         expression evaluations have to be performed.
            >         They are usually quite simple expressions such as this:
            >                 exItems = 
jexl.createExpression("parameter.numItems");
            >
            >         The best option would be to improve the ANTLR grammar to 
parse the expressions a little more formally correctly, to implement a model 
for these expressions, and to have them automatically translated to code like:
            >                this.getParameter().getNumItems();
            >
            >         It should be possible and a lot faster ... anyone up for 
the challenge? @Julian? ... could you please help with this,
            >         as you did such a great job with the initial spec ANTLR 
grammar?
            >
            >         Chris
            >
            >
            >
            >         On 05.06.19 at 10:09, "Christofer Dutz" 
<[email protected]> wrote:
            >
            >             Hi all,
            >
            >             On the train today I'll be working on the 
serialization (which will be a challenge).
            >             I am sure this will be a lot of hard work, but 
also a great step forward.
            >
            >             Is there any progress on the Driver-Logic generation 
front?
            >
            >             Otherwise I would probably try to whip up a 
hand-written Netty layer using the generated model.
            >             Without all the parser/serializer code this should 
only be a fragment of the existing driver code.
            >
            >             Chris
            >
            >
            >
            >
            >             On 05.06.19 at 09:59, "Strljic, Matthias Milan" 
<[email protected]> wrote:
            >
            >                 Hurray, sounds nice 😉
            >                 I hope I find time to play a bit around with it 
in the next few days.
            >
            >                 Greetings
            >                 Matthias Strljic, M.Sc.
            >
            >                 Universität Stuttgart
            >                 Institut für Steuerungstechnik der 
Werkzeugmaschinen und Fertigungseinrichtungen (ISW)
            >
            >                 Seidenstraße 36
            >                 70174 Stuttgart
            >                 GERMANY
            >
            >                 Tel: +49 711 685-84530
            >                 Fax: +49 711 685-74530
            >
            >                 E-Mail: [email protected]
            >                 Web: http://www.isw.uni-stuttgart.de
            >
            >                 -----Original Message-----
            >                 From: Christofer Dutz <[email protected]>
            >                 Sent: Tuesday, June 4, 2019 4:16 PM
            >                 To: [email protected]
            >                 Subject: [CodeGen] Performance values
            >
            >                 Hi all,
            >
            >                 so as I mentioned in Slack yesterday, I was able 
to successfully parse an S7 packet with the code generated by the code generator.
            >                 There I am using Apache JEXL for evaluating the 
expressions we are using all over the place. It got things working quite easily.
            >                 However, my gut feeling told me that all these JEXL 
evaluators I'm creating can't be ideal.
            >
            >                 But not wanting to prematurely optimize 
something that's already good, I did a measurement:
            >
            >                 So I did a little test, in which I let my parser 
parse one message 20000 times.
            >
            >                 It came up with an average time of 0.8ms ... this 
didn't sound too bad compared to the roughly 20ms of the interpreted Daffodil 
approach.
            >                 But is this already good? It's probably not ideal 
to compare results with ones we know are bad; instead I wanted to 
compare them to the ones we are proud of.
            >
            >                 In order to find out I whipped up a little manual 
test with the existing S7 driver.
            >                 For this I simply plugged the 3 layers together 
with an Embedded channel and used a custom handler at the end to return the 
message.
            >                 This seems to work quite nicely and I was able to 
run the same test with the Netty based S7 driver layers we have been using for 
the last 2 years.
            >
            >                 The results were:
            >                 Parsed 20000 packets in 796ms
            >                 That's 0.0398ms per packet
            >
            >                 So this is a HUGE difference.
            >
            >                 As one last check I ran JProfiler over the 
benchmark, and it confirmed that 87% of the time was used by JEXL.
            >                 However, it was the creation of the JEXL 
expressions, not their evaluation.
            >                 So one optimization I'll try is to have 
the expressions created statically and then simply reference them.
            >                 This will increase the complexity of the 
template, but should speed things up.
            >                 And I'll also change the code I'm generating for 
the type switches to work without JEXL.
            >
            >                 Simply assuming this would eliminate the time 
wasted by JEXL (a great simplification), we would reduce the parse time to 0.1ms, 
which is still about 3 times that of the existing driver.
            >                 I am assuming that this might be related to the 
BitInputStream I am using ... but we'll deal with that as soon as we're done with 
getting rid of the time wasted by JEXL.
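The create-once/evaluate-many optimization described in that mail can be sketched without JEXL on the classpath. The `Expression` interface and toy "compiler" below are stand-ins for org.apache.commons.jexl3 types; the point is only the pattern of compiling each expression string at most once and then merely referencing it:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ExpressionCache {

    // Stand-in for a compiled expression; in the real generator this would
    // be a JEXL expression object (JEXL itself is not on the classpath here).
    interface Expression {
        Object evaluate(Map<String, Object> context);
    }

    private static final Map<String, Expression> CACHE = new ConcurrentHashMap<>();
    static int compileCount = 0; // counts compilations, to show each happens once

    // The expensive "compilation" runs at most once per expression string;
    // afterwards the cached instance is simply referenced.
    static Expression expression(String source) {
        return CACHE.computeIfAbsent(source, src -> {
            compileCount++;
            // Toy compiler: supports only dotted lookups into nested maps.
            String[] path = src.split("\\.");
            return ctx -> {
                Object current = ctx;
                for (String step : path) {
                    current = ((Map<?, ?>) current).get(step);
                }
                return current;
            };
        });
    }

    public static void main(String[] args) {
        Map<String, Object> ctx = Map.of("parameter", Map.of("numItems", 3));
        Object result = expression("parameter.numItems").evaluate(ctx);
        expression("parameter.numItems"); // cache hit, no second compilation
        System.out.println(result + " / compiled " + compileCount + " time(s)");
    }
}
```

Holding the compiled expressions in static fields of the generated class, as Chris suggests, is the same idea with the lookup resolved at code-generation time instead of via a map.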
            >
            >                 So much for an update from the generated-drivers front.
            >
            >                 Chris
            >
            >
            >
            >
            >
            >
            >
            >
            >
            >
    
    
    
    
    
