Hi Jim,

On Wed, Jun 27, 2012 at 7:27 PM, jim holtman <[email protected]> wrote:
> One place to start is to use Rprof to see where time is being spent.
> I used the sample you sent and this is what I got:
>
>
>  0  16.7 root
>  1.   16.2 system.time
>  2. .   16.1 testfoo
>  3. . .   16.1 setdiff
>  4. . . .    8.2 as.vector
>  5. . . . .    8.2 findSubsets
>  6. . . . . .    6.4 increment
>  7. . . . . . .    4.2 as.vector
>  8. . . . . . . .    3.6 outer
>  9. . . . . . . . .    0.3 rep.int
>  7. . . . . . .    1.6 c
>  7. . . . . . .    0.2 max
>  4. . . .    7.9 unique
>  5. . . . .    7.3 match
>  5. . . . .    0.3 unique.default
>  1.    0.5 sort
>  2. .    0.5 standardGeneric
>  3. . .    0.3 sample
>  3. . .    0.2 sort
>  4. . . .    0.2 sort.default
>  5. . . . .    0.2 sort.int
>
> Of the 16.7 seconds to execute the code, 16.1 was taken up in
> 'setdiff'.  Maybe there is some other way you can determine the
> difference.  So if you continue to use 'setdiff', it does not look
> like there is much that can be done.
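
For anyone wanting to reproduce this kind of profile, a minimal sketch follows. It assumes a hypothetical workload, slow_setdiff(), standing in for testfoo() (which is not shown in this message), and it uses summaryRprof(), so the output is a flat table rather than the call tree quoted above.

```r
# Hypothetical workload standing in for testfoo(): it repeatedly
# shortens a vector with setdiff(), as the real loop does.
slow_setdiff <- function(n) {
    v <- seq_len(n)
    for (i in seq_len(200)) v <- setdiff(v, sample(v, 5))
    v
}

prof_file <- tempfile(fileext = ".out")
Rprof(prof_file)                 # start the sampling profiler
res <- slow_setdiff(2e5)
Rprof(NULL)                      # stop profiling

# Summarise the samples; by.total includes time spent in callees.
# Note: a run that finishes too quickly may record no samples at all.
prof <- summaryRprof(prof_file)
print(head(prof$by.total))
```

The by.total table should make it easy to see whether setdiff() (and unique() / match() underneath it) dominates, as in the quoted output.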

One thing to note is that setdiff() is called inside the while() loop,
so its cost is paid at every iteration.

I could in principle loop over the entire vector and eliminate all
the derived numbers at the end, but I have a hunch it might take even
longer. The point of setdiff() was to progressively shorten the vector,
so that the loop has fewer values left to visit. On the other hand,
setdiff() rebuilds and overwrites the vector at each iteration, and
that of course also takes time.
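
To make the trade-off concrete, here is a rough sketch of one possible alternative: flag the derived numbers in a logical vector inside the loop and subset only once at the end. It rests on assumptions that are not taken from the real code: the values are positive integers no larger than a known `limit`, and derive() is a hypothetical stand-in for findSubsets() (here it simply returns the multiples of x).

```r
# Hypothetical stand-in for findSubsets(): the numbers "derived" from x.
# For illustration only: the multiples of x up to limit, excluding x.
derive <- function(x, limit) {
    if (2 * x > limit) return(integer(0))
    seq.int(2L * x, limit, by = x)
}

# Current approach: shorten the vector with setdiff() at every iteration.
prune_setdiff <- function(v, limit) {
    i <- 1L
    while (i <= length(v)) {
        v <- setdiff(v, derive(v[i], limit))
        i <- i + 1L
    }
    v
}

# Alternative: keep the vector as is, flag derived values in a logical
# vector indexed by value, and subset once after the loop.
# Assumes every value in v is a positive integer <= limit.
prune_mask <- function(v, limit) {
    present <- logical(limit)
    present[v] <- TRUE
    for (x in v) {
        # values already flagged as derived are skipped, as in the loop above
        if (present[x]) present[derive(x, limit)] <- FALSE
    }
    v[present[v]]
}

v <- 2:30
stopifnot(identical(prune_setdiff(v, 30L), prune_mask(v, 30L)))
```

Whether this helps depends on how the derived numbers are actually produced. It avoids the repeated match() / unique() calls inside setdiff(), but it does nothing about the time spent in findSubsets() itself.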

I thought a C program might prove to be faster (because looping over
each value in the vector is cheaper there), but although it works just
fine, it takes a similarly long time. It seems I am not using C
properly, probably because I am manipulating memory too much.

Well, any other quicker alternative would do...
Thanks,
Adrian

-- 
Adrian Dusa
Romanian Social Data Archive
1, Schitu Magureanu Bd.
050025 Bucharest sector 5
Romania
Tel.:+40 21 3126618 \
       +40 21 3120210 / int.101
Fax: +40 21 3158391
