I would like to use a so-called sliding window protocol for
sampling. That is, I do not want to do the complete analysis, so I use
the sliding window as a substitute. Here is an example.
Suppose we have the following elements in a set.
{e1, e2, e3, e4, e5, e6, e7, e8, e9, e10}
Suppose I need to take 3 of them at a time to form a subgroup. The
total number of combinations is 10 C 3, i.e. 120. This number is still
manageable. However, if the data set becomes larger, a complete
enumeration quickly becomes impractical. Therefore, I chose the
sliding window protocol to overcome this.
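To make the growth concrete, here is a minimal sketch that counts the
subgroups for a few set sizes. The helper name n_subgroups and the set
sizes other than 10 are only illustrative assumptions, not part of my
data.

```python
import math

def n_subgroups(n, k):
    """Number of unordered subgroups of size k drawn from n elements (n C k)."""
    return math.comb(n, k)

# Set sizes beyond 10 are hypothetical, chosen only to show the growth.
for n in (10, 20, 50, 100):
    print(n, n_subgroups(n, 3))
```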
As we need to form subgroups of 3 each time, the following
combinations are obtained.
{e1, e2, e3}, {e2, e3, e4}, {e3, e4, e5}, {e4, e5, e6}, {e5, e6, e7},
{e6, e7, e8}, {e7, e8, e9}, {e8, e9, e10}, {e9, e10, e1}, {e10, e1,
e2}
By doing this we can make sure that each element appears three
times. Assume the order is not important, i.e. {e1, e2, e3} is equal to
{e2, e1, e3} or {e3, e2, e1} and so on.
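The windows above can be generated with a short sketch like the
following. The function name circular_windows is my own, and I use
frozenset so that the order inside a subgroup does not matter.

```python
from collections import Counter

def circular_windows(elements, k):
    """Return the n wrap-around windows of size k as unordered subgroups."""
    n = len(elements)
    return [
        frozenset(elements[(start + offset) % n] for offset in range(k))
        for start in range(n)
    ]

elements = [f"e{i}" for i in range(1, 11)]
windows = circular_windows(elements, 3)

# Count how often each element occurs across all windows.
counts = Counter(e for window in windows for e in window)
print(len(windows), sorted(set(counts.values())))
```

With a window of size 3 that wraps around, every element falls into
exactly 3 of the 10 windows.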
The problem is that this includes less than 10% of the complete
set, namely 10 out of 120. My concern is how representative these 10
combinations are. I do not know how to choose any 10 of the 120
subgroups, or more generally n of the 120.
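For comparison, one plain way to pick n of the 120 subgroups is simple
random sampling of subgroups. This is not the sliding window method,
only a sketch of an alternative; the name sample_subgroups and the
seed parameter are assumptions for illustration.

```python
import math
import random

def sample_subgroups(elements, k, n_samples, seed=None):
    """Draw n_samples distinct size-k subgroups uniformly at random,
    without enumerating all combinations."""
    if n_samples > math.comb(len(elements), k):
        raise ValueError("more samples requested than subgroups exist")
    rng = random.Random(seed)
    chosen = set()
    while len(chosen) < n_samples:
        # Each draw is a uniformly random k-subset; duplicates are rejected.
        chosen.add(frozenset(rng.sample(elements, k)))
    return chosen

elements = [f"e{i}" for i in range(1, 11)]
picked = sample_subgroups(elements, 3, 10, seed=0)
```

Unlike the sliding window, this does not guarantee that every element
appears equally often, or at all, in a small sample.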
Could you give me some suggestions to solve this? I can think of only
one solution: randomize the order of the original set first and slide
the window afterwards. Even so, I am afraid this is not a good way to
sample the data set.
Or can my sliding window concept not be applied here? If so, what
else could be used to deal with this problem?
Thank you very much!!