On 17 February 2012 18:00, Lewis John Mcgibbney <[email protected]> wrote:
> Hi Michele,
>
> On Sat, Feb 11, 2012 at 5:00 PM, Michele Mostarda <[email protected]> wrote:
>
> > The purpose was to help other developers understanding the library
> > internals, so any feedback and suggestion would be appreciated.
>
> Listen, first and foremost this is really really helpful, so thank you. I
> have only one note below:
>
> 1) {bq}The next phase is performed by the <<Content Validation and
> Patching>> module (<<org.apache.any23.validator>>), and it is required
> because the most part of data exposed on the Web is affected by minor
> issues which compromise the correct working of some <Extractors>.{bq}
>
> What are the "minor issues which compromise the correct working of some
> extractors"? It would be really great to understand by means of an example
> what a typical or boundary case issue might be.

We found the following common issues:

org.apache.any23.validator.rule.MissingOpenGraphNamespaceRule: the OpenGraph page does not declare the namespace 'og'.
org.apache.any23.validator.rule.AboutNotURIRule: Microformats 'about' tags don't contain a valid URL.
org.apache.any23.validator.rule.MetaNameMisuseRule: meta tag properties contain a prefixed value.

> Apart from that, it's certainly given me a much clearer understanding of
> how the project fits together so thanks again for linking me with this
> contribution it.

Happy to be useful.

Bye
Mic

> Lewis

--
Michele Mostarda
Senior Software Engineer
skype: michele.mostarda
twitter: micmos
mail: [email protected]
site: http://www.michelemostarda.com
