Suppose I have a training set in which some samples belong to class I, {A1,A2,...,An}, and the other samples belong to class II, {B1,B2,...,Bm}.
The divergence is defined as d(p1,p2), and there is a generalized form d(p1,p2,...,pk). Now a test sample t arrives. There are three ways to compute the divergence between t and each class of the training set:

1. Use the generalized form: d(t,A1,A2,...,An) and d(t,B1,B2,...,Bm).
2. Use the average pairwise divergence: \Sigma_k d(t,Ak)/n and \Sigma_k d(t,Bk)/m.
3. Find \bar{A} that minimizes \Sigma_k d(\bar{A},Ak) and \bar{B} that minimizes \Sigma_k d(\bar{B},Bk); the divergences are then d(t,\bar{A}) and d(t,\bar{B}).

Then find which of the two divergences is smaller and classify the test sample into that class.

My questions:

1. Which of the three ways is correct?
2. Is it correct to use the smaller divergence as the criterion for classification?

Thanks.
Goshiwen
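
To make ways 2 and 3 above concrete, here is a minimal Python sketch. It assumes d is the squared Euclidean distance, which the post does not specify; the function names and toy data are hypothetical. Way 1 is left out because the generalized form d(p1,...,pk) is not given. Under this assumption the arithmetic mean is the minimizer in way 3, and the two ways can disagree when one class is much more spread out than the other.

```python
# Assumption: d is squared Euclidean distance (the post names no divergence).
def sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def mean_divergence(t, samples):
    # Way 2: average of d(t, A_k) over all samples of the class.
    return sum(sq_dist(t, s) for s in samples) / len(samples)

def centroid(samples):
    # For squared Euclidean distance, the arithmetic mean
    # minimizes sum_k d(c, A_k) over c.
    n = len(samples)
    return [sum(col) / n for col in zip(*samples)]

def centroid_divergence(t, samples):
    # Way 3: divergence between t and the class representative.
    return sq_dist(t, centroid(samples))

def classify(t, class_a, class_b, score):
    # Assign t to the class with the smaller score.
    return "I" if score(t, class_a) <= score(t, class_b) else "II"

# Hypothetical toy data: class I is widely spread, class II is tight.
class_a = [[0.0, 0.0], [4.0, 0.0], [0.0, 4.0], [4.0, 4.0]]
class_b = [[5.0, 2.0], [5.0, 2.0]]
t = [3.0, 2.0]

print(classify(t, class_a, class_b, mean_divergence))      # way 2
print(classify(t, class_a, class_b, centroid_divergence))  # way 3
```

For this particular divergence, the way 2 score equals the way 3 score plus the average divergence of the class samples from their own mean, so way 2 penalizes a widely spread class. Other divergences need not behave the same way.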