As part of my studies for the master's degree in information and communication technology (ICT) at Agder University College (http://www.hia.no/english/), I am involved in a research project named ROBACC, whose goal is to develop an automated internet spider that assesses the accessibility of web pages.
More information about ROBACC can be found at http://osys.grm.hia.no/fou/robacc/robacc.html

My part in this project is to develop a classifier that can automatically identify the authoring tool used to create a given web page, based on the structure of its HTML code. The classifier is an implementation of the naive Bayes algorithm, so it needs a fairly large set of training data to work correctly. In my case, the training data will consist of HTML documents created with the authoring tool I want the classifier to identify.

Getting hold of suitable training data is difficult, so I am posting this message in the hope that you can point me to websites created with the Mozilla editor. These HTML documents would then serve as the foundation of my classifier.

I hope you can help me with this.

Best regards
Svein Arild
_______________________________________________
mozilla-editor mailing list
[EMAIL PROTECTED]
http://mail.mozilla.org/listinfo/mozilla-editor
