[Pywikipedia-bugs] [Maniphest] [Commented On] T138119: Use user-maintained bot run mode to gain stats and learn

DrTrigon Mon, 20 Jun 2016 08:17:20 -0700

DrTrigon added a comment.

On 18 June 2016 05:31:48 CEST, AbdealiJK <[email protected]> wrote:

AbdealiJK added a comment.

This is an interesting question, and the major issue I see here is that
the user's computer will hang if we do use it.

So, the best method may be to have something like what a lot of
software does: "Would you like to send usage statistics to the owner to
make the software better?"

And then in the next release use the information to create a training
set which is more comprehensive.

I think that could add value and let us tune parameters on a much larger database (user experience) and a wider range of file formats than just our own.
So we should first formulate the questions we want to answer, and then work out which stats we have to store in order to answer them. Everything must be anonymized.
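
To make the "opt in, store only what we need, anonymize" idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration only: the field names, the whitelist, the salt handling and the function names are hypothetical and are not part of Pywikibot. Note also that a salted hash is strictly pseudonymization, so whether it is good enough would still have to be decided.

```python
import hashlib
import json

# Hypothetical whitelist: only these non-identifying fields are ever recorded.
ALLOWED_FIELDS = ("file_format", "script", "runtime_seconds", "success")


def anonymize(value, salt):
    """Return a salted SHA-256 digest so the raw value is never stored."""
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()


def build_stats_record(run_info, opted_in, salt):
    """Build a stats record, or return None if the user did not opt in."""
    if not opted_in:
        return None
    # Copy only whitelisted fields; anything else (paths, names) is dropped.
    record = {key: run_info[key] for key in ALLOWED_FIELDS if key in run_info}
    # A salted hash lets runs be grouped without storing the user name itself.
    if "username" in run_info:
        record["user_token"] = anonymize(run_info["username"], salt)
    return record


# Hypothetical data from one bot run.
run = {
    "username": "ExampleBot",
    "file_format": "jpeg",
    "script": "example_script",
    "runtime_seconds": 12.5,
    "success": True,
    "local_path": "/home/user/x.jpg",
}
record = build_stats_record(run, opted_in=True, salt="per-install-random-salt")
print(json.dumps(record, sort_keys=True))
```

The whitelist is the important design choice here: each question we want to answer adds a field to it, and anything not on the list never leaves the user's machine.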

Note that a lot of times a larger training set can make the learning
agent worse. It basically depends on where you want the hyperplane to be
drawn, etc. So, the training set *needs* to be well curated.

Indeed, that is an important point. Just adding a lot of noise makes it worse. The training set has to suit the data.
Knowing which persons appear in the data and using that for training might be appropriate for a face matcher, like http://docs.opencv.org/master/dc/dc3/tutorial_py_matcher.html#gsc.tab=0

Dr. Trigon

TASK DETAIL

https://phabricator.wikimedia.org/T138119

EMAIL PREFERENCES

https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: DrTrigon
Cc: jayvdb, AbdealiJK, Aklapper, pywikibot-bugs-list, DrTrigon, Zppix, Lethexie, Jay8g

_______________________________________________
pywikibot-bugs mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/pywikibot-bugs
