On Aug 26, 2014, at 9:23 AM, Dennis Lee Bieber <wlfr...@ix.netcom.com> wrote:
> On Tue, 26 Aug 2014 12:16:33 +0200, lavanya addepalli <phani....@gmail.com>
> declaimed the following:
>
>> How can i generate a random data that is identical to my realworld data
>>
>
> By definition, "random data" will be unlikely to ever be "identical" to
> your "realworld data".
>
>
>> i am supposed to refer the attached paper
>>
>> Real Data
>>
>> node pairs and the time they spend together connected
>>
>> node node time in seconds
>> 4391 2814 16.0
>> 1885 1158 351.0
>> 1349 1174 6375.0
>>
>
> Since I see no cases of duplicate node /pairs/ it is difficult to
> figure out just what that data really represents...
>
> With enough data, with duplicate pairs having different times, I'd
> likely group by pairs, generate mean and standard deviation for the times
> of the matching pairs, then generate some count of the pairs to develop
> weights... Finally, using the weights I'd attempt to generate random node
> pairs and then use the mean/SD of the result pair to generate a time from
> the gaussian distribution.
>
> With only the data you have, I'd end up with a sparse 2D matrix
> M[first_node, second_node] = time
>
> And then selecting random samples from that...
> --
> Wulfraed Dennis Lee Bieber AF6VN
> wlfr...@ix.netcom.com HTTP://wlfraed.home.netcom.com/

I think the OP wanted to create some sort of test file that was formatted
like his real-world data, but was filled with artificial data. That, of
course, simply becomes an exercise in creating lists of random 4-digit
integers and floats, then arranging them as lines and writing out the data.

-Bill
--
https://mail.python.org/mailman/listinfo/python-list
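For what it's worth, here is a minimal sketch of both readings discussed in this thread. It is only an illustration, not anything taken from the OP's paper: the function names (fake_rows, resample_rows, write_rows) are made up, and so are the 1-7000 second range, the whole-second times, and the "no node paired with itself" rule. fake_rows produces a file that merely looks like the real one; resample_rows follows the grouping/weights/Gaussian idea and only makes sense once the real data contains repeated pairs.

```python
import random
import statistics
from collections import defaultdict


def fake_rows(n, seed=None):
    """Rows that only *look* like the real file: two 4-digit ids and a float."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        a = rng.randint(1000, 9999)        # 4-digit node id
        b = rng.randint(1000, 9999)
        while b == a:                      # assumption: a node never pairs with itself
            b = rng.randint(1000, 9999)
        t = float(rng.randint(1, 7000))    # assumed range, whole seconds as in the sample
        rows.append((a, b, t))
    return rows


def resample_rows(real_rows, n, seed=None):
    """Pick pairs weighted by how often they occur, then draw a time
    from a Gaussian with that pair's mean and standard deviation."""
    rng = random.Random(seed)
    times = defaultdict(list)
    for a, b, t in real_rows:
        times[(a, b)].append(t)            # group observed times by node pair

    pairs = list(times)
    weights = [len(times[p]) for p in pairs]   # weight = number of observations

    rows = []
    for pair in rng.choices(pairs, weights=weights, k=n):
        obs = times[pair]
        mean = statistics.mean(obs)
        # stdev needs at least two points; a pair seen once gets no spread
        sd = statistics.stdev(obs) if len(obs) > 1 else 0.0
        t = max(0.0, rng.gauss(mean, sd))  # clamp: a duration cannot be negative
        rows.append((pair[0], pair[1], round(t, 1)))
    return rows


def write_rows(rows, path):
    """Write rows as 'node node seconds', one per line, like the sample data."""
    with open(path, "w") as f:
        for a, b, t in rows:
            f.write(f"{a} {b} {t}\n")
```

Something like write_rows(fake_rows(1000), "fake.txt") would give a test file in the same layout. With the three sample rows quoted above, resample_rows can only hand back the observed times, since no pair repeats and every standard deviation falls back to zero.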