Seriously, go to hell.
You know what the problem with you is: whenever someone is motivated
to fix a problem with Rock and it is not done the way you think is
best, you start bashing them. By the way, a correct response would be: great,
you looked into it and made it faster, go on.

A lot of people think that the log replay tools are slow as hell, and that
it is annoying to wait 20 seconds every time rock-replay starts up.

Just to finish this: it is slow as hell. I just did tests with log data I generated
using Mars:
time rock-replay ~/Arbeit/asguard/bundles/asguard/logs/current
pocolog.rb[INFO]: building index ...
pocolog.rb[INFO]: done
pocolog.rb[INFO]: building index ...
pocolog.rb[INFO]: done
pocolog.rb[INFO]: building index ...
pocolog.rb[INFO]: done
pocolog.rb[INFO]: building index ...
pocolog.rb[INFO]: done
pocolog.rb[INFO]: building index ...
pocolog.rb[INFO]: done
Aligning streams. This can take a long time
pocolog.rb[INFO]: Got 77 streams with 295166 samples
pocolog.rb[INFO]: Stream Aligner index created

real    0m20.860s
user    0m19.605s
sys     0m1.008s

time ./multiIndexer ~/Arbeit/asguard/bundles/asguard/logs/current/*.log
Building multi file index
 100% Done
Processed 295169 of 295169 samples

real    0m1.089s
user    0m0.780s
sys     0m0.304s

This is a huge speedup, and it is worth it.
    Janosch


On 09.06.2014 17:04, Sylvain Joyeux wrote:


        Created a dataset of one minute with 100 streams. Each stream
        is at 100Hz, so that's 600k samples. It took 4.6 seconds to
        generate the index and 0.8 seconds to load the file index
        (from warm cache, so with probably little I/O overhead).

    How long did the stream alignment take? This is the part where
    the problem usually is, as you can't do better than
    O((log n)*s) there, where n is the number of streams and s the
    number of samples.

??? What are you talking about? This is only the asymptotic bound. The alignment takes 4.6 seconds.
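For context, the O((log n)*s) bound mentioned above is what you get from a heap-based k-way merge of timestamped streams. The following is an illustrative sketch in Python (not the actual pocolog.rb or multiIndexer code); the stream layout is a made-up example:

```python
import heapq

def align_streams(streams):
    """k-way merge of per-stream (timestamp, sample) lists.

    Each heap pop/push costs O(log n) for n streams, so aligning
    s samples in total costs O(s * log n).
    """
    heap = []
    for idx, stream in enumerate(streams):
        it = iter(stream)
        first = next(it, None)
        if first is not None:
            # (timestamp, stream index, sample, iterator); the stream
            # index breaks ties so iterators are never compared
            heapq.heappush(heap, (first[0], idx, first[1], it))
    while heap:
        ts, idx, sample, it = heapq.heappop(heap)
        yield ts, idx, sample
        nxt = next(it, None)
        if nxt is not None:
            heapq.heappush(heap, (nxt[0], idx, nxt[1], it))

# Example: three streams with interleaved timestamps
a = [(0, "a0"), (3, "a1")]
b = [(1, "b0"), (4, "b1")]
c = [(2, "c0")]
print([ts for ts, _, _ in align_streams([a, b, c])])  # [0, 1, 2, 3, 4]
```

Note that the bound only describes how the cost grows; as the 4.6-second measurement shows, the constant factors are what actually matter here.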



        C++ *is* faster. Of course it is. From what I see, not fast
        enough to justify the refactoring that you are proposing.

    Oh yes, it does. Recently I did a lot of localization debugging.
    You can't skip over data in this case (nor in a lot of other use
    cases), which means you have to replay the whole log stream. If
    the replay is twice as fast, you need half
    the time for debugging. So in my eyes it is 100% worth the effort.

Except that making the part that currently takes only 10% of the replay time twice as fast will make the overall process 5% faster. Even making it 100 times faster will only save 9%. From what I see, this is what you are attempting, as what takes the most time is I/O and Typelib demarshalling.
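The arithmetic above is just Amdahl's law. A quick sketch (the 10% fraction is the estimate from the message, not a measurement):

```python
def overall_speedup(fraction, local_speedup):
    """Amdahl's law: speed up `fraction` of the runtime by `local_speedup`."""
    return 1.0 / ((1.0 - fraction) + fraction / local_speedup)

# Optimizing a part that takes 10% of the replay time:
print(round(overall_speedup(0.10, 2), 3))    # 2x faster locally   -> ~1.053x overall (~5% saved)
print(round(overall_speedup(0.10, 100), 3))  # 100x faster locally -> ~1.110x overall (~9.9% saved)
```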

In other words: you are attempting to optimize something without having done any profiling. This is a cardinal sin.

        Again, you are *not* giving the right measurements. Speed
        factors and durations are meaningless if we don't know how
        many samples each stream has and how long each stream lasts.
        Just "it is 24x faster" means nothing.

    You have the C++ implementation; just run multiIndexTester on your
    test data and compare the results.


Sylvain

_______________________________________________
Rock-dev mailing list
[email protected]
http://www.dfki.de/mailman/cgi-bin/listinfo/rock-dev
