Glad to hear that it runs fast now. A cursory look at the code indicates (correct me if wrong) that DailyProjectUnitTest performs two distinct functions:

(1) Functions used by reducers: getUnitTestInfo(filePattern), getUnitTestInfo(filePattern, user).
    They use "this.cache".

(2) Functions used by daily project analysis: getSummaryStrings(user), getDrillDowns(user)
     They use:
           private Map workspaceUserDataMap = new ConcurrentHashMap();
           private Map workspaceDataMap = new ConcurrentHashMap();
           private Map userDataMap = new ConcurrentHashMap();
           private Data totalData;

The two caches do not overlap. The constructor calls "buildWorkspaceUserData()", which populates the daily project analysis cache eagerly, so telemetry users pay a penalty for a cache they do not use.

Why do we have to put them together? Why not have two types of daily project objects: one for telemetry-style analysis and one for daily-project-details-style analysis? My 2 cents.
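
To make the point concrete, here is a rough sketch of a milder alternative: keep one class but build the daily project analysis data lazily, so reducer-only callers never trigger it. Apart from buildWorkspaceUserData, getUnitTestInfo and workspaceDataMap, every name below (LazyUnitTestData, getSummaryString, buildCount, the stand-in computations) is made up for illustration and is not the real DailyProjectUnitTest code.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch only: the real class has different fields and signatures.
class LazyUnitTestData {
  // Cache used by the reducer-style queries.
  private final Map<String, Integer> cache = new ConcurrentHashMap<>();
  // Cache used by the daily project analysis; left null until first needed.
  private Map<String, Integer> workspaceDataMap;
  // Counts how often the expensive build runs (for demonstration only).
  int buildCount = 0;

  LazyUnitTestData() {
    // Deliberately empty: nothing is built eagerly in the constructor.
  }

  // Reducer path: touches only this.cache.
  int getUnitTestInfo(String filePattern) {
    // Stand-in computation; the real code would count matching unit tests.
    return cache.computeIfAbsent(filePattern, p -> p.length());
  }

  // Daily project analysis path: triggers the build on first use.
  String getSummaryString(String user) {
    return user + ": " + workspaceData().size() + " workspace(s)";
  }

  private synchronized Map<String, Integer> workspaceData() {
    if (workspaceDataMap == null) {
      workspaceDataMap = buildWorkspaceUserData();
    }
    return workspaceDataMap;
  }

  private Map<String, Integer> buildWorkspaceUserData() {
    buildCount++;
    Map<String, Integer> data = new ConcurrentHashMap<>();
    data.put("hackyCore_Kernel\\", 1); // placeholder entry
    return data;
  }
}
```

With this shape, a telemetry caller that only uses getUnitTestInfo never pays for buildWorkspaceUserData, and the first summary or drill-down request pays for it once. Splitting into two classes would make the separation explicit rather than relying on lazy initialization.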

Cheers,

Cedric



Hongbing Kou wrote:
Hi, Philip & Cedric,

I revamped DailyProjectUnitTest. It now takes only 6 seconds to compute both summary and detail for day 4/5/2006 on my desktop; it took 124 seconds yesterday. Overuse of FilePattern matching was the key cause of the performance regression: it made daily project analysis 20 times slower.
Cedric will have to revisit the other DailyProjectData classes for performance.

Philip's idea of taking out the loop over project workspaces is doable but hard to implement. Here is why it is more complicated than we think. Suppose we have the file

C:\svn\hackystat\hackyCore_Kernel\src\org\hackystat\kernel\util\DataInfo.java

If the workspace root is C:\svn\hackystat, then the trimmed path will be

       hackyCore_Kernel\src\org\hackystat\kernel\util\DataInfo.java

We can get the top-level workspace hackyCore_Kernel\. Although it looks the same as project Hackystat-7's workspace hackyCore_Kernel\, the two are different instances. With the current implementation, we still have to loop through all of this project's workspaces to compare. To achieve O(1) time complexity, we may have to introduce a hash code on the case-insensitive trimmed path. I will get back to this when I have time.
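
A minimal sketch of what hashing on the case-insensitive trimmed path could look like follows. It assumes backslash-separated Windows paths as in the example above; the class and method names (WorkspaceIndex, key, register, projectFor) are hypothetical and do not exist in the current code base.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch: maps a normalized top-level workspace name to a project.
class WorkspaceIndex {
  private final Map<String, String> projectByKey = new HashMap<>();

  // Normalizes a trimmed path to its lower-cased top-level segment,
  // e.g. "hackyCore_Kernel\src\..." becomes "hackycore_kernel\".
  static String key(String trimmedPath) {
    String p = trimmedPath.replace('/', '\\').toLowerCase(Locale.ROOT);
    int sep = p.indexOf('\\');
    return sep < 0 ? p : p.substring(0, sep + 1);
  }

  void register(String workspace, String project) {
    projectByKey.put(key(workspace), project);
  }

  // Single hash lookup instead of a loop over the project's workspaces.
  // Returns null when the file is outside the root or no workspace matches.
  String projectFor(String workspaceRoot, String absolutePath) {
    // Case-insensitive check that the file lives under the workspace root.
    if (!absolutePath.regionMatches(true, 0, workspaceRoot, 0, workspaceRoot.length())) {
      return null;
    }
    String trimmed = absolutePath.substring(workspaceRoot.length());
    if (trimmed.startsWith("\\")) {
      trimmed = trimmed.substring(1); // drop the separator after the root
    }
    return projectByKey.get(key(trimmed));
  }
}
```

Because the key is a plain lower-cased String, two workspace instances that look the same produce the same key, which sidesteps the instance-comparison problem at the cost of one extra map maintained per project set.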

Cheers,
Hongbing
