Suppose I have a training set in which some samples belong to class I,
{A1,A2,...,An}, and the other samples belong to class II, {B1,B2,...,Bm}.

The divergence between two samples is defined as d(p1,p2), and there is a
generalized form for k arguments, d(p1,p2,...,pk).

Now a test sample t arrives.

I see three ways to compute the divergence between t and each class of the
training set.

1.  Use the generalized form: d(t, A1,A2,...,An) and d(t, B1,B2,...,Bm).

2.  Use the average pairwise divergence: \Sigma_k d(t,Ak)/n and
\Sigma_k d(t,Bk)/m.

3.  Find \bar{A} that minimizes \Sigma_k d(\bar{A},Ak) and \bar{B} that
minimizes \Sigma_k d(\bar{B},Bk); the divergences are then d(t,\bar{A}) and
d(t,\bar{B}).

Then compare the two divergences and assign the test sample to the class
with the smaller one.
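
To make the three options concrete, here is a minimal Python sketch. It is
only an illustration under assumptions that are not stated above: samples are
taken to be discrete probability vectors with strictly positive entries, the
pairwise d is taken to be the Kullback-Leibler divergence, and the generalized
form is taken to be the equal-weight Jensen-Shannon divergence. All function
names and the toy data are hypothetical. With KL and the argument order
written in option 3, the minimizer of \Sigma_k d(c,Ak) is the normalized
geometric mean of the Ak.

```python
import math

def kl(p, q):
    """Assumed pairwise divergence: KL, d(p, q) = sum_i p_i * log(p_i / q_i)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def entropy(p):
    """Shannon entropy in nats."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def generalized_js(dists):
    """Assumed generalized form: equal-weight Jensen-Shannon divergence,
    H(mean of the distributions) - mean of the individual entropies."""
    k = len(dists)
    mean = [sum(col) / k for col in zip(*dists)]
    return entropy(mean) - sum(entropy(p) for p in dists) / k

def way1(t, samples):
    """Option 1: generalized divergence of t together with the whole class."""
    return generalized_js([t] + samples)

def way2(t, samples):
    """Option 2: average pairwise divergence from t to each class member."""
    return sum(kl(t, s) for s in samples) / len(samples)

def kl_centroid(samples):
    """Minimizer over c of sum_k kl(c, s_k): the normalized geometric mean.
    Requires strictly positive entries in every sample."""
    n = len(samples)
    g = [math.exp(sum(math.log(x) for x in col) / n) for col in zip(*samples)]
    z = sum(g)
    return [x / z for x in g]

def way3(t, samples):
    """Option 3: divergence from t to the class centroid."""
    return kl(t, kl_centroid(samples))

def classify(t, class_a, class_b, score):
    """Assign t to the class with the smaller score (ties go to class I)."""
    return "I" if score(t, class_a) <= score(t, class_b) else "II"

# Hypothetical toy data: three-bin probability vectors.
class_a = [[0.7, 0.2, 0.1], [0.6, 0.3, 0.1]]
class_b = [[0.1, 0.2, 0.7], [0.2, 0.2, 0.6]]
t = [0.65, 0.25, 0.10]

labels = {f.__name__: classify(t, class_a, class_b, f)
          for f in (way1, way2, way3)}
```

Note that in option 1 the two scores are computed over n+1 and m+1
distributions respectively, so when n and m differ they may not be directly
comparable without some normalization.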

1. Which of the three ways is correct?
2. Is it correct to use the smaller divergence as the criterion for
classification?

Thanks.

Goshiwen


