On 11/11/2016 at 11:29, Stephan Eggermont wrote:
On 10/11/16 21:35, Igor Stasenko wrote:
No, no, no! This is simply not true.
It is you who writes the code that generates a lot of statistical
data/analysis data, and its output is fairly predictable. Otherwise you are
not collecting any data, just random noise, isn't it?
That would be green field development. In brown field development, I
only get in when people start noticing there is a problem (why do we
need more than 4GBytes for this?). At that point I want to be able to
load everything they can give me in an image so I can start analyzing
and structuring it.
I mean, Doru is light years ahead of me and many others in the field of data
analysis, so what can I advise him on his own playground?
Well, the current FAMIX model implementation is clearly not well
structured for analyzing large code bases. And it is difficult to
partition because of unpredictable access patterns and high
interconnection.
This is why you look for a general purpose, efficient off-loading
scheme, trying to optimize the general case and get reasonable performance
out of it (something like Fuel, but designed for partial unloading / loading:
allow dangling references in a unit of load, and focus on per-page units to
match the underlying storage layer or network).
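
To make the idea concrete, here is a minimal sketch in Python (not Smalltalk, and not the VW layer mentioned in this mail). Everything in it is an assumption for illustration: the PagedStore class, the JSON page files, and the [page_id, key] encoding of a dangling reference are all hypothetical. It only shows the two properties above: the page is the unit of load, and references that cross a page boundary are left dangling until somebody follows them.

```python
import json
import tempfile
from collections import OrderedDict
from pathlib import Path


class PagedStore:
    """Hypothetical store: at most `max_resident` pages in memory, the rest on disk."""

    def __init__(self, directory, max_resident=2):
        self.directory = Path(directory)
        self.max_resident = max_resident
        self.resident = OrderedDict()  # page_id -> {key: object}, oldest first
        self.loads = 0                 # number of page reads from disk

    def _path(self, page_id):
        return self.directory / f"{page_id}.json"

    def write_page(self, page_id, objects):
        """Store one page as a single unit of load."""
        self._path(page_id).write_text(json.dumps(objects))

    def _page(self, page_id):
        if page_id in self.resident:
            self.resident.move_to_end(page_id)   # mark as most recently used
            return self.resident[page_id]
        if len(self.resident) >= self.max_resident:
            self.resident.popitem(last=False)    # evict least recently used page
        self.resident[page_id] = json.loads(self._path(page_id).read_text())
        self.loads += 1
        return self.resident[page_id]

    def resolve(self, ref):
        """Follow a dangling reference [page_id, key], loading its page on demand."""
        page_id, key = ref
        return self._page(page_id)[key]


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        store = PagedStore(tmp, max_resident=1)
        # Objects refer to each other across pages only through [page_id, key] pairs.
        store.write_page("p1", {"A": {"name": "ClassA", "calls": [["p2", "B"]]}})
        store.write_page("p2", {"B": {"name": "ClassB", "calls": [["p1", "A"]]}})
        a = store.resolve(["p1", "A"])
        b = store.resolve(a["calls"][0])  # p1 is evicted from the store, p2 is loaded
        return a["name"], b["name"], store.loads, list(store.resident)
```

With max_resident=1, following the reference from A to B costs a second page load and pushes p1 out of the store. Pages are read-only in this sketch, so eviction just drops them; a real layer would also have to write back modified pages and decide what happens to objects still held by the application.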
I wrote one such layer for VW a long time ago, but didn't have time to
experiment with / qualify some of the techniques in it. There was an
interesting attempt (IMHO; it wasn't qualified) at combining paging with
automatic refinement of the application working set, based on previous
experience implementing a hierarchical 2D object access scheme for large
datasets on a slow medium (which decreased access time from 30 minutes to
a few seconds).
The other approach I would look at is to take some of the support code for
such an automatic layer and use it to unload parts of my model, and I'm
pretty sure that, if I don't benchmark intensively, I'll get the
partitioning wrong :(
Overall, an interesting subject, totally not valid from a scientific
point of view (the database guys have already solved everything). Only
valid as a hobby, or if a company is ready to pay for a solution.
Thierry