Re: Normalizations in MLBase

2014-06-13 Thread Aslan Bekirov
Thanks a lot DB. I will test it and let you know the results.

BR,
Aslan

On Fri, Jun 13, 2014 at 12:34 AM, DB Tsai wrote:
> Hi Aslan,
>
> I'm not sure if mlbase code is maintained for the current spark
> master. The following is the code we use for standardization in my
> company. I'm intended

Re: Normalizations in MLBase

2014-06-12 Thread DB Tsai
Hi Aslan,

I'm not sure if the mlbase code is maintained for the current Spark master. The following is the code we use for standardization in my company. I intend to clean it up and submit a PR. You could use it for now.

    def standardize(data: RDD[Vector]): RDD[Vector] = {
      val summarizer = new

Re: Normalizations in MLBase

2014-06-12 Thread Aslan Bekirov
Hi DB,

I found a piece of code that uses znorm to normalize data.

    /**
     * build training data set from sample and summary data
     */
    val train_data = sample_data.map( v =>
      Array.tabulate[Double](field_cnt)(
        i => zscore(v._2(i), sample_mean(i), sample_stddev(i))
      )
    ).cache

Please make you
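The snippet above calls `zscore` without showing its definition. A minimal sketch of what such a helper might look like follows, assuming the usual z-score formula (value minus mean, divided by standard deviation). The object name `ZScoreSketch`, the sample column, the use of the population standard deviation, and the choice to return 0.0 for a constant column are all assumptions for illustration, not taken from the quoted code.

```scala
// Sketch only: the signature is inferred from the call
// zscore(value, mean, stddev) in the message above.
object ZScoreSketch {
  // Assumption: a column with zero spread is mapped to 0.0 to avoid dividing by zero.
  def zscore(x: Double, mean: Double, stddev: Double): Double =
    if (stddev == 0.0) 0.0 else (x - mean) / stddev

  def main(args: Array[String]): Unit = {
    // Hypothetical sample column.
    val column = Seq(2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0)
    val mean = column.sum / column.size
    // Population standard deviation (divides by n, not n - 1).
    val stddev = math.sqrt(column.map(v => math.pow(v - mean, 2)).sum / column.size)
    // With Spark, the same function could be applied inside RDD.map.
    println(column.map(v => zscore(v, mean, stddev)).mkString(", "))
  }
}
```

In the quoted code the mean and standard deviation are per-field arrays (`sample_mean(i)`, `sample_stddev(i)`), so a helper of this shape would be applied once per field.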

Re: Normalizations in MLBase

2014-06-12 Thread Aslan Bekirov
Thanks a lot DB. I will try to do Znorm normalization using a map transformation.

BR,
Aslan

On Thu, Jun 12, 2014 at 12:16 AM, DB Tsai wrote:
> Hi Aslan,
>
> Currently, we don't have the utility function to do so. However, you
> can easily implement this by another map transformation. I'm worki

Re: Normalizations in MLBase

2014-06-11 Thread DB Tsai
Hi Aslan,

Currently, we don't have a utility function to do so. However, you can easily implement this with another map transformation. I'm working on this feature now, and there will be a couple of different normalization options users can choose from.

Sincerely,
DB Tsai

Normalizations in MLBase

2014-06-11 Thread Aslan Bekirov
Hi All,

I have to normalize a set of values in the range 0-500 to the [0, 1] range. Is there any util method in MLBase to normalize a large set of data?

BR,
Aslan
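For reference, the rescaling asked about here is a linear min-max transform. A minimal sketch in plain Scala follows, assuming the bounds 0 and 500 are known in advance. The names `MinMaxSketch` and `minMaxScale` are hypothetical and are not part of MLBase or MLlib.

```scala
// Sketch only: hypothetical helper, not an MLBase or MLlib API.
object MinMaxSketch {
  // Linearly rescales x from the interval [lo, hi] to [0, 1].
  def minMaxScale(x: Double, lo: Double, hi: Double): Double = {
    require(hi > lo, "hi must be greater than lo")
    (x - lo) / (hi - lo)
  }

  def main(args: Array[String]): Unit = {
    // Hypothetical sample values in the 0-500 range.
    val values = Seq(0.0, 125.0, 250.0, 500.0)
    // With Spark, the same function could be passed to RDD.map instead of Seq.map.
    val scaled = values.map(v => minMaxScale(v, 0.0, 500.0))
    println(scaled.mkString(", "))
  }
}
```

If the bounds are not known up front, they would have to be computed from the data first, which costs an extra pass over the dataset.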