Thank you.
But the pages I want to crawl are just out on the Internet, and I certainly
can't control them.


2006/2/10, Vanderdray, Jacob <[EMAIL PROTECTED]>:
>
>        If you control the "temporary links" pages, then just add a
> robots meta tag.  Take a look at
> http://www.robotstxt.org/wc/meta-user.html to see what your options are.
>
> Jake.
>
> -----Original Message-----
> From: Elwin [mailto:[EMAIL PROTECTED]
> Sent: Friday, February 10, 2006 4:38 AM
> To: [email protected]
> Subject: How to control contents to be indexed?
>
> In the process of crawling and indexing, some pages serve only as
> "temporary links" to the pages I want to index. How can I keep those
> pages from being indexed? Or which part of Nutch should I extend?
>



--
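"The Final Combat" (《盖世豪侠》) won rave reviews and kept TVB's ratings
high, yet for all its delight TVB still gave him no major roles. Stephen
Chow was never one to stay in a small pond: with his comic talent already
on show, he would not accept being sidelined, so he moved into film and
shone on the big screen. Having gained a thousand-li horse only to lose
it, TVB of course regretted it bitterly.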
