On 22.04.2014 17:35, George Kadianakis wrote:
> Enjoy GSoC :)

I will :)

> BTW, looking again at your proposal, I see that you are going to
> do both popularity tracking and backlinks.

Yes, another crawler gathers backlinks from the public WWW, and I will
start gathering URL clicks from the users.

> How are these two technologies going to interact with each other?
> That is, how will the indexer consider the output of those two
> features?

The Django front-end re-sorts the answers from the YaCy back-end. See
https://ahmia.fi/static/gsoc/re_sort.jpg

I have this idea in mind:
https://ahmia.fi/static/gsoc/sorter.py

The result is sorted according to the YaCy result index, the number of
backlinks and the number of clicks, each of which is scaled. Note the
scaling:

    p_info.backlinks = 1 / (float(index) + 1)

etc.

    sum_function = 3.0*self.yacy + 2.0*self.backlinks + 1.0*self.clicks

where 3, 2 and 1 are test coefficients. I will optimize these and make
a better model if necessary. However, clicks are easily spoofed, so
their coefficient has to be small.

> Also, with your newly acquired knowledge about backlinks, how long
> is it going to take you to incorporate them in ahmia? Are you
> actually going to do it during the "Use an another crawler to
> search .onion pages from the public Internet" phase?

We can test it when the popularity tracking and the backlinks crawler
are working.

-Juha
_______________________________________________
tor-dev mailing list
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev
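
The re-sorting described in the message can be sketched roughly as follows. This is a minimal illustration and not the contents of sorter.py. It assumes that each of the three signals (YaCy position, backlink count, click count) is first turned into a rank and then scaled with 1 / (index + 1), as the quoted scaling line suggests. The names PageInfo, backlink_count, click_count and re_sort are hypothetical; only the scaling expression and the 3/2/1 test coefficients come from the message.

```python
from dataclasses import dataclass

# Test coefficients quoted in the message; they are expected to be tuned.
W_YACY, W_BACKLINKS, W_CLICKS = 3.0, 2.0, 1.0


@dataclass
class PageInfo:
    """Hypothetical record for one search result."""
    url: str
    backlink_count: int = 0   # raw count from the backlink crawler (assumed)
    click_count: int = 0      # raw count from click tracking (assumed)
    yacy: float = 0.0         # scaled scores, filled in by re_sort()
    backlinks: float = 0.0
    clicks: float = 0.0

    def sum_function(self) -> float:
        # Weighted sum of the three scaled scores.
        return (W_YACY * self.yacy
                + W_BACKLINKS * self.backlinks
                + W_CLICKS * self.clicks)


def scale(index: int) -> float:
    # Rank 0 scores 1.0, rank 1 scores 0.5, rank 2 scores 0.333..., and so on.
    return 1 / (float(index) + 1)


def re_sort(results):
    """Re-sort results that arrive in YaCy order (best match first)."""
    # Score by position in the YaCy answer.
    for index, p_info in enumerate(results):
        p_info.yacy = scale(index)

    # Score by rank among the results when ordered by backlink count.
    by_backlinks = sorted(results, key=lambda p: p.backlink_count, reverse=True)
    for index, p_info in enumerate(by_backlinks):
        p_info.backlinks = scale(index)

    # Score by rank among the results when ordered by click count.
    by_clicks = sorted(results, key=lambda p: p.click_count, reverse=True)
    for index, p_info in enumerate(by_clicks):
        p_info.clicks = scale(index)

    # Highest combined score first; sorted() is stable, so ties keep YaCy order.
    return sorted(results, key=PageInfo.sum_function, reverse=True)
```

Because every signal is reduced to a rank before weighting, a page with a very large raw click count cannot score higher on clicks than 1.0, which, together with the small click coefficient, limits how far spoofed clicks can move a result. One assumption to be aware of: pages with equal raw counts still receive different scaled scores, depending on their earlier position.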
