After the updates I mentioned in the last e-mail, this is what happens:
[trunk] [~2m 17s]
new 10 7704.7 1.52101e-14 0.00000 1.52101e-14
new 30 9395.2 1.52101e-14 0.00000 1.52101e-14
new 80 9400.5 1.52101e-14 0.00000 1.52101e-14
new 180 12842.9 1.52101e-14 0.00000 1.52101e-14
new 380 5654.4 1.52101e-14 0.00000 1.52101e-14
new 880 6880.4 1.52101e-14 0.00000 1.52101e-14
old 10 36023.5 2.84217e-14 0.00000 2.84217e-14
old 30 55919.1 2.84217e-14 0.00000 2.84217e-14
old 80 58147.7 2.84217e-14 0.00000 2.84217e-14
old 180 54198.4 2.84217e-14 0.00000 2.84217e-14
old 380 37995.1 2.84217e-14 0.00000 2.84217e-14
old 880 44413.5 2.84217e-14 0.00000 2.84217e-14
Speedup is about 6.5 times
[new vector ops] [~1m 35s]
new 10 17086.4 1.44329e-14 0.00000 1.44329e-14
new 30 10960.8 1.44329e-14 0.00000 1.44329e-14
new 80 11734.8 1.44329e-14 0.00000 1.44329e-14
new 180 34976.0 1.44329e-14 0.00000 1.44329e-14
new 380 16007.1 1.44329e-14 0.00000 1.44329e-14
new 880 13113.1 1.44329e-14 0.00000 1.44329e-14
old 10 46757.8 2.65343e-14 0.00000 2.65343e-14
old 30 23111.9 2.65343e-14 0.00000 2.65343e-14
old 80 16987.3 2.65343e-14 0.00000 2.65343e-14
old 180 15822.4 2.65343e-14 0.00000 2.65343e-14
old 380 28304.9 2.65343e-14 0.00000 2.65343e-14
old 880 15262.6 2.65343e-14 0.00000 2.65343e-14
Speedup is about 1.4 times (FAILS)
[new vector ops, before changes] [~3m 7s]
new 10 11642.0 1.14353e-14 0.00000 1.14353e-14
new 30 8169.5 1.14353e-14 0.00000 1.14353e-14
new 80 8446.0 1.14353e-14 0.00000 1.14353e-14
new 180 8429.7 1.14353e-14 0.00000 1.14353e-14
new 380 9316.2 1.14353e-14 0.00000 1.14353e-14
new 880 10924.3 1.14353e-14 0.00000 1.14353e-14
old 10 55476.1 2.59792e-14 0.00000 2.59792e-14
old 30 64453.2 2.59792e-14 0.00000 2.59792e-14
old 80 59954.5 2.59792e-14 0.00000 2.59792e-14
old 180 71600.2 2.59792e-14 0.00000 2.59792e-14
old 380 70933.0 2.59792e-14 0.00000 2.59792e-14
old 880 63348.3 2.59792e-14 0.00000 2.59792e-14
Speedup is about 6.3 times
Which of these is better?
Each row is printed by line 224:
System.out.printf("%s %d\t%.1f\t%g\t%g\t%g\n",
    label, n, (t1 - t0) / 1.0e3 / n, maxIdent, maxError, warmup);
and the 3rd column seems to be the time: the elapsed time divided by 1000 and by n.
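For reference, here's a minimal, self-contained sketch of the kind of loop that could produce such a row; the System.nanoTime() calls, the loop, the class name, and the placeholder values are my assumptions, not the actual test code:

// Sketch only: if t0 and t1 are nanoTime() readings taken around n
// repetitions, the 3rd column would be microseconds per repetition.
public class TimingRowSketch {
  public static void main(String[] args) {
    String label = "new";
    int n = 10;
    double maxIdent = 0.0, maxError = 0.0, warmup = 0.0;  // placeholders
    long t0 = System.nanoTime();
    for (int i = 0; i < n; i++) {
      // ... one decomposition would run here ...
    }
    long t1 = System.nanoTime();
    System.out.printf("%s %d\t%.1f\t%g\t%g\t%g\n",
        label, n, (t1 - t0) / 1.0e3 / n, maxIdent, maxError, warmup);
  }
}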
The time fluctuates and doesn't seem to depend on the count...
Which of these 3 runs is better?
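To make the explanation in the quoted message below concrete, here is a minimal, self-contained sketch of the pattern I was describing; the class and method names (other than getIteratorAdvanceCost and DecoratorElement, which are the ones mentioned below) are made up, and this is not the actual Mahout code:

import java.util.Iterator;

// Sketch only, not the actual Mahout code: a view-style iterator that wraps
// every element of the backing iterator in a new object so the index can be
// offset, which is the per-nonzero allocation cost described below.
public class ViewIterationSketch {

  // Minimal stand-in for a vector element.
  interface Element {
    int index();
    double get();
  }

  static Iterator<Element> viewIterator(final Iterator<Element> backing,
                                        final int offset) {
    return new Iterator<Element>() {
      @Override
      public boolean hasNext() {
        return backing.hasNext();
      }

      @Override
      public Element next() {
        final Element delegate = backing.next();
        // One new decorator per nonzero; a getIteratorAdvanceCost() equal to
        // the backing vector's does not account for these allocations.
        return new Element() {
          @Override
          public int index() {
            return delegate.index() - offset;
          }

          @Override
          public double get() {
            return delegate.get();
          }
        };
      }

      @Override
      public void remove() {
        backing.remove();
      }
    };
  }
}

Each allocation is cheap on its own but adds up over every nonzero in a tight loop, which is presumably why tweaking the advance cost for views helps the cost model pick a better algorithm.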
On Fri, May 3, 2013 at 7:51 PM, Dan Filimon <[email protected]> wrote:
> I think I found out why, for the QR test.
>
> First off, it's stable and not seed-dependent (on my machine, anyway; I
> haven't looked too closely).
>
> Trunk takes about 2 minutes and my new vector branch takes more than 3.
> From what I've seen, the problem is twofold:
> - norm1 is still slower in the new code
> - VectorViews suck at iterating through. They create a new
> DecoratorElement for every nonzero (so the index can be adjusted).
> The problem is that when picking the best algorithm, I made
> getIteratorAdvanceCost the same as that of the vector being viewed, not
> realizing the impact of creating new elements.
>
> I'll get back to you after I:
> - change norm1 to what it used to be
> - tweak the iterator advance cost for vector views
>
> On Fri, May 3, 2013 at 7:23 PM, Ted Dunning <[email protected]> wrote:
>
>> Shouldn't depend on seed.
>>
>> Very odd.
>>
>> Sent from my iPhone
>>
>> On May 3, 2013, at 8:24, Robin Anil <[email protected]> wrote:
>>
>> > QRDecompositionTest: I saw this from time to time. Sometimes it runs in
>> > 0.2 seconds, sometimes 100s. Seed related?
>> >
>> >
>> >
>> > On Fri, May 3, 2013 at 9:59 AM, Dan Filimon <[email protected]> wrote:
>> >
>> >> QRDecompositionTest.fasterThanBefore() and most of the tests in
>> >> fpm.pfpgrowth take a really long time to run (FPGrowthSyntheticDataTest
>> >> took 98s on my machine).
>> >>
>> >> Could we do something about these?
>> >> Maybe move fasterThanBefore() into a benchmark and out of the tests (like
>> >> VectorBenchmark) and simplify the fpm.* tests somehow?
>> >>
>> >> Thoughts? Thanks!
>> >>
>>
>
>