Oleg has pointed me in the right direction: he suggested using something like class AttrOk elType attr for attributes (result: 35 s -> 30 s) and doing something similar with the state (30 s -> 4.5 s). After doing this I was quite happy: the 4.5 s include reading the DTD and generating about 1500 DecQ declarations. There might still be some duplication in the state transformation steps. However, typechecking the same file with the body repeated 400 times didn't finish. Why?
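To make the attribute idea concrete, here is a minimal sketch of how I read it: one instance of a two-parameter class per allowed (element, attribute) pair, so checking an attribute is a single instance lookup. The element and attribute types (A, Img, Href, Src), the Named class and the attr function are made up for illustration; they are not the library's actual API, and the real declarations would be generated from the DTD.

```haskell
{-# LANGUAGE MultiParamTypeClasses #-}
module Main where

-- Hypothetical element and attribute tags (illustration only).
data A = A
data Img = Img
data Href = Href
data Src = Src

-- One instance per (element, attribute) pair that the DTD allows.
class AttrOk elType attr

instance AttrOk A Href
instance AttrOk Img Src

-- Hypothetical helper class giving each tag its textual name.
class Named t where
  name :: t -> String

instance Named A where name _ = "a"
instance Named Img where name _ = "img"
instance Named Href where name _ = "href"
instance Named Src where name _ = "src"

-- Renders an empty element with one attribute. This only typechecks
-- when an AttrOk instance exists for the given pair.
attr :: (AttrOk el at, Named el, Named at) => el -> at -> String -> String
attr el at value =
  "<" ++ name el ++ " " ++ name at ++ "=\"" ++ value ++ "\"/>"

main :: IO ()
main = do
  putStrLn (attr A Href "http://haskell.org")
  putStrLn (attr Img Src "logo.png")
  -- putStrLn (attr Img Href "x")  -- rejected: no instance AttrOk Img Href
```

Uncommenting the last line should make the compiler complain about a missing AttrOk Img Href instance, which is the whole point of the validation.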
Have a quick glance at the file data (reproduced below), which shows quadratic behaviour. That's bad. I would like to have linear behaviour. Any ideas what is causing this? Is there a way to get linear scalability?

However, disabling all the validation stuff (see the cabal flag) no longer improves performance much. The data below proves that right now validation increases compilation time by a factor of 1.35-1.45 (within the range of 5-30 replications of the body).

If you'd like to play with the library and give some feedback, I'd be happy. Read the README. The benchpress dependency is only needed for this benchmark test; you may want to remove it from the cabal file.

So don't try to compile 4000-line XML files, or be prepared to wait days.

I should expand the benchmark to also offer results for the xhtml lib.

Sincerely
Marc Weber

============= compilation times with validation ==============================================
body replication count | compilation time [ms]
 1   4146.477
 2   4292.153
 3   4508.56
 4   4654.244000000001
 5   4788.195
 6   5041.674999999999
 7   5347.1140000000005
 8   6134.5960000000005
 9   6019.624000000001
10   6459.544
11   7054.433999999999
12   7614.197
13   8489.003
14   8529.610999999999
15   9271.491
16  10058.419
17  12290.142
18  13736.074999999999
19  14863.893
20  15944.82
21  17856.611999999997
22  17977.841
23  17686.297
24  19279.314
25  20960.785
26  22750.754
27  24407.506
28  26342.242
29  28423.79
30  30932.777000000002
31  48478.841
32  45609.897
33  40574.255
34  41220.062
35  43952.545999999995
36  47437.922
37  50584.12100000001
38  53983.848000000005
39  57935.593
============= =======================================================

E.g. start gnuplot and enter:

  f(x)=(x-b)**2*c+a
  fit f(x) 'data' via a, b, c
  plot 'data', f(x)

============= compilation times without validation ================================
body replication count | compilation time [ms]
 1   3939.887   << Import.hs has been recompiled; that's why the first run took longer than the next ones
 2   3138.127
 3   3080.317
 4   3179.19
 5   3279.339
 6   3604.609
 7   3736.829
 8   4094.6179999999995
 9   4252.377
10   4767.588
11   5070.188
12   5537.007
13   5840.085
14   6478.276
15   6916.67
16   7568.444
17   8301.047
18   9211.368999999999
19  10025.886
20  10872.749
21  12086.263
22  12975.372
23  14366.282
24  15640.214
============= =======================================================

_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
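For those without gnuplot at hand, the same kind of fit can be sketched in plain Haskell. This is only an illustration, not part of the library: it does an ordinary least-squares fit of p0 + p1*x + p2*x^2 via the normal equations (solved with Cramer's rule) and then rewrites the result in the form f(x)=(x-b)**2*c+a used above. The function names are made up, and the sample points in main are a handful of rows copied from the "with validation" table.

```haskell
module Main where

type Point = (Double, Double)

-- Determinant of a 3x3 matrix given as a list of rows.
det3 :: [[Double]] -> Double
det3 [[a, b, c], [d, e, f], [g, h, i]] =
  a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
det3 _ = error "det3: expected a 3x3 matrix"

-- Least-squares coefficients (p0, p1, p2) of y = p0 + p1*x + p2*x^2.
-- Needs at least three points with distinct x values.
fitQuadratic :: [Point] -> (Double, Double, Double)
fitQuadratic pts = (solve 0, solve 1, solve 2)
  where
    s, t :: Int -> Double
    s k = sum [x ^ k | (x, _) <- pts]        -- sum of x^k
    t k = sum [y * x ^ k | (x, y) <- pts]    -- sum of y*x^k
    m = [[s 0, s 1, s 2], [s 1, s 2, s 3], [s 2, s 3, s 4]]
    rhs = [t 0, t 1, t 2]
    -- Cramer's rule: replace column j of m by the right-hand side.
    replaceCol :: Int -> [[Double]]
    replaceCol j =
      [ [if c == j then r else v | (c, v) <- zip [0 ..] row]
      | (row, r) <- zip m rhs ]
    solve j = det3 (replaceCol j) / det3 m

-- Convert to the parameters (a, b, c) of f(x) = (x-b)^2*c + a.
toVertexForm :: (Double, Double, Double) -> (Double, Double, Double)
toVertexForm (p0, p1, p2) = (a, b, c)
  where
    c = p2
    b = negate p1 / (2 * p2)
    a = p0 - c * b * b

main :: IO ()
main = do
  -- A few rows from the "with validation" table above.
  let sample =
        [ (1, 4146.477), (5, 4788.195), (10, 6459.544), (15, 9271.491)
        , (20, 15944.82), (25, 20960.785), (30, 30932.777000000002) ]
  print (toVertexForm (fitQuadratic sample))
```

Cramer's rule is fine for a 3x3 system of this size; for anything larger or badly conditioned a proper linear algebra routine would be the better choice.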