Created issue:
https://issues.apache.org/jira/browse/NUTCH-1262

On Tuesday 31 January 2012 06:58:56 Alexander Aristov wrote:
> Hi
> 
> Of course we all understand that these two types are not the same and serve
> different purposes, but since Nutch doesn't distinguish between them it
> would be possible and reasonable to make the content type the same.
> 
> But there might be some problems. Some Nutch users might rely on the
> content type and apply a special parser for application/xhtml+xml,
> perhaps to handle additional namespaces.
> 
> Of course, for indexing and searching the replacement would be good.
> 
> 
> In fact there are many other examples where different content types can
> be treated in the same way. What if we had a feature for grouping several
> content types into a single one?
> 
> Best Regards
> Alexander Aristov
> 
> On 30 January 2012 17:12, Markus Jelsma <[email protected]> wrote:
> > Hi,
> > 
> > Should we not provide an optional replace for the content type field in
> > index-more? They are the same for end-users but end up differently in an
> > index.
> > 
> > Thoughts?
> > Thanks
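
For illustration, the grouping idea discussed above could be sketched roughly
as follows. This is only a sketch under assumptions: the class name
ContentTypeGroups, the normalize method and the single xhtml-to-html mapping
are hypothetical and are not existing Nutch code or configuration.

```java
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of Nutch: collapses content types that are
// equivalent for end-users into one canonical value before indexing.
final class ContentTypeGroups {

    // Assumed grouping table; a real feature would read this from configuration.
    private static final Map<String, String> GROUPS = Map.of(
        "application/xhtml+xml", "text/html"
    );

    private ContentTypeGroups() {}

    // Returns the canonical content type, or the cleaned-up input when it
    // belongs to no group.
    static String normalize(String contentType) {
        if (contentType == null) {
            return null;
        }
        // Drop parameters such as "; charset=UTF-8" and normalize case.
        int semi = contentType.indexOf(';');
        String base = (semi >= 0 ? contentType.substring(0, semi) : contentType)
            .trim()
            .toLowerCase(Locale.ROOT);
        return GROUPS.getOrDefault(base, base);
    }

    public static void main(String[] args) {
        System.out.println(normalize("application/xhtml+xml; charset=UTF-8"));
        System.out.println(normalize("application/pdf"));
    }
}
```

Keeping the replacement optional and driven by a table like this would let
users who rely on a dedicated application/xhtml+xml parser leave the mapping
empty.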

-- 
Markus Jelsma - CTO - Openindex
