On 17 February 2012 18:00, Lewis John Mcgibbney
<[email protected]> wrote:

> Hi Michele,
>
> On Sat, Feb 11, 2012 at 5:00 PM, Michele Mostarda <
> [email protected]> wrote:
>
> > The purpose was to help other developers understand the library
> > internals, so any
> > feedback and suggestions would be appreciated.
> >
> Listen, first and foremost, this is really helpful, so thank you. I
> have only one note below:
>
> 1) {bq}The next phase is performed by the <<Content Validation and
> Patching>> module (<<org.apache.any23.validator>>), and it is required
> because the most part of data exposed on the Web is affected by minor
> issues which compromise the correct working of some <Extractors>.{bq}

> What are the "minor issues which compromise the correct working of some
> extractors"? It would be really great to understand by means of an example
> what a typical or boundary case issue might be.
>

We found the following common issues:

- org.apache.any23.validator.rule.MissingOpenGraphNamespaceRule: the
  OpenGraph page does not declare the namespace 'og'.
- org.apache.any23.validator.rule.AboutNotURIRule: Microformats about tags
  don't contain a valid URL.
- org.apache.any23.validator.rule.MetaNameMisuseRule: meta tag properties
  contain a prefixed value.
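
To make these three cases concrete, here is a minimal sketch of the kind
of markup each one refers to. It is not the Any23 validator code and does
not use its API: the class name, the method names and the regex-based
checks are all hypothetical, and treating "valid URL" as "parses as an
absolute URI" is my own simplifying assumption.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only: NOT the org.apache.any23.validator API.
final class CommonIssuesSketch {

    // A <meta> tag whose property attribute uses the 'og' prefix.
    private static final Pattern OG_PROPERTY = Pattern.compile(
            "<meta\\b[^>]*\\bproperty\\s*=\\s*[\"']og:", Pattern.CASE_INSENSITIVE);

    // An xmlns:og declaration on the <html> element.
    private static final Pattern OG_NAMESPACE = Pattern.compile(
            "<html\\b[^>]*\\bxmlns:og\\s*=", Pattern.CASE_INSENSITIVE);

    // A <meta> tag whose name attribute holds a prefixed value such as "og:type".
    private static final Pattern PREFIXED_META_NAME = Pattern.compile(
            "<meta\\b[^>]*\\bname\\s*=\\s*[\"'][A-Za-z][\\w-]*:[^\"']+[\"']",
            Pattern.CASE_INSENSITIVE);

    // Any about="..." attribute; group 1 captures its value.
    private static final Pattern ABOUT_ATTR = Pattern.compile(
            "\\babout\\s*=\\s*[\"']([^\"']*)[\"']", Pattern.CASE_INSENSITIVE);

    // Case 1: 'og:' properties are used but the 'og' namespace is not declared.
    static boolean missingOpenGraphNamespace(String html) {
        return OG_PROPERTY.matcher(html).find() && !OG_NAMESPACE.matcher(html).find();
    }

    // Case 2: at least one 'about' value is not an absolute URI (assumption, see above).
    static boolean aboutNotUri(String html) {
        Matcher m = ABOUT_ATTR.matcher(html);
        while (m.find()) {
            if (!isAbsoluteUri(m.group(1))) {
                return true;
            }
        }
        return false;
    }

    // Case 3: a meta 'name' attribute contains a prefixed value.
    static boolean metaNameMisuse(String html) {
        return PREFIXED_META_NAME.matcher(html).find();
    }

    private static boolean isAbsoluteUri(String value) {
        try {
            return new URI(value.trim()).isAbsolute();
        } catch (URISyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // A made-up page that exhibits all three issues at once.
        String page = "<html><head>"
                + "<meta property=\"og:title\" content=\"Example\"/>"
                + "<meta name=\"og:type\" content=\"article\"/>"
                + "</head><body><div about=\"not a uri\">text</div></body></html>";
        System.out.println("missing og namespace: " + missingOpenGraphNamespace(page));
        System.out.println("about is not a URI:   " + aboutNotUri(page));
        System.out.println("meta name misuse:     " + metaNameMisuse(page));
    }
}
```

Pages like the one built in main() are the typical case: the markup is
almost right, but an extractor that relies on the 'og' prefix being
declared, or on 'about' being a resolvable URI, would otherwise miss or
mangle the data.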


>
> Apart from that, it's certainly given me a much clearer understanding of
> how the project fits together, so thanks again for pointing me to this
> contribution.
>

Happy to be useful.

Bye
Mic

>
> Lewis
>



-- 
Michele Mostarda
Senior Software Engineer
skype: michele.mostarda
twitter: micmos
mail: [email protected]
site : http://www.michelemostarda.com
