On Thu, Jan 15, 2015 at 5:23 AM, Miguel Angel Martin junquera <mianmarjun.mailingl...@gmail.com> wrote:

> My question is:
> Is it better to scale up these dimensions directly in the tf-idf
> sequence final mix file using these correction factors, OR to first scale
> up each of the tf-vectors, then mix the vectors and recalculate the final
> tf-idf, to minimize errors or deviations in a subsequent clustering
> from these final mixed tf-idf vectors?
>

Mathematically, it doesn't matter whether you scale the vectors at generation time, just before computing distances, or during the distance computation itself. The different places for the change may be more or less easy in terms of programming. The two easiest places tend to be at the beginning (if you know the weights), since you have to write that code anyway, or at the end, since some programs have provisions for changing the metric.
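
To make that concrete, here is a minimal sketch in plain Python. The vectors, idf values, and correction factors are made up for illustration (they are not from your data), and the helper names are hypothetical. It assumes the idf values are computed from document frequencies and so are not affected by rescaling the term frequencies, and that the metric is Euclidean. Under those assumptions, scaling the tf vectors before applying idf, scaling the finished tf-idf vectors, and applying the weights inside the distance function all give the same distance, up to floating-point rounding.

```python
import math

def scale(vec, weights):
    """Multiply each dimension of vec by its correction factor."""
    return [w * x for w, x in zip(weights, vec)]

def tfidf(tf, idf):
    """Combine term frequencies with fixed per-term idf values."""
    return [t * i for t, i in zip(tf, idf)]

def euclidean(a, b):
    """Plain Euclidean distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def weighted_euclidean(a, b, weights):
    """Euclidean distance with the weights applied inside the metric."""
    return math.sqrt(sum((w * (x - y)) ** 2 for w, x, y in zip(weights, a, b)))

# Hypothetical data: two tf vectors, per-term idf, per-dimension weights.
tf_a = [1.0, 3.0, 0.0, 2.0]
tf_b = [2.0, 1.0, 4.0, 0.0]
idf = [0.3, 1.2, 2.0, 0.7]
weights = [1.0, 2.0, 0.5, 4.0]

# Option 1: scale the tf vectors first, then compute tf-idf.
d_scale_tf = euclidean(tfidf(scale(tf_a, weights), idf),
                       tfidf(scale(tf_b, weights), idf))

# Option 2: compute tf-idf first, then scale the final vectors.
d_scale_tfidf = euclidean(scale(tfidf(tf_a, idf), weights),
                          scale(tfidf(tf_b, idf), weights))

# Option 3: leave the tf-idf vectors alone and weight the metric instead.
d_in_metric = weighted_euclidean(tfidf(tf_a, idf), tfidf(tf_b, idf), weights)

print(math.isclose(d_scale_tf, d_scale_tfidf))
print(math.isclose(d_scale_tfidf, d_in_metric))
```

All three distances agree here because each option multiplies every coordinate difference by the same factor, so the choice comes down to where the code is easiest to write.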