So, now in 2015, is it still necessary to block some bots and some
URLs, or should everything be opened up, or should this bug be closed, or...?
Just a "ping" :-).
--
Ivan Baldo - iba...@adinet.com.uy - http://ibaldo.codigolibre.net/
From Montevideo, Uruguay, at the south of South America.
So right now Google is allowed to spider bugs.debian.org, but other
search engines are not. That seems discriminatory.
Perhaps the web server logs could be checked to see how much load
Googlebot actually generates?
If the numbers are not very significant, other spiders could be allowed,
couldn't they?
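The log check suggested above could be sketched like this (an illustrative Python snippet, assuming Apache combined log format; the bot names and sample lines are placeholders, not real traffic data):

```python
import re
from collections import Counter

# User-agent substrings to tally; an illustrative, not exhaustive, list.
BOTS = ["Googlebot", "msnbot", "Slurp"]  # Slurp is Yahoo's crawler

def count_bot_requests(lines):
    """Count requests per crawler from combined-format log lines."""
    counts = Counter()
    for line in lines:
        # The user-agent is the last double-quoted field in combined format.
        agents = re.findall(r'"([^"]*)"', line)
        ua = agents[-1] if agents else ""
        for bot in BOTS:
            if bot in ua:
                counts[bot] += 1
    return counts

sample = [
    '1.2.3.4 - - [09/Jan/2008:00:00:01 +0000] "GET /cgi-bin/bugreport.cgi?bug=459818 HTTP/1.1" 200 1234 "-" "Mozilla/5.0 (compatible; Googlebot/2.1)"',
    '5.6.7.8 - - [09/Jan/2008:00:00:02 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
]
print(count_bot_requests(sample))
```

Running this over a day of access logs would give a per-crawler request count to compare against total traffic.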
On Wed, 09 Jan 2008, Don Armstrong wrote:
> On Thu, 10 Jan 2008, Anthony Towns wrote:
> > I've made those changes on rietz directly; what's the procedure
> > for committing them? "sudo -u debbugs -H bzr commit" ? There was a
> > pre-existing change in pkgreport.cgi (adding a "^" to the "Go away"
> >
On Thu, 10 Jan 2008, Anthony Towns wrote:
> (In practice, with google barely indexing anything in the BTS yet; lookup
> for bug#459818 by googling for `medium dhclient-script' works fine;
> using hyperstraier on merkel takes ages and doesn't return any hits)
That's because you actually meant to se
On Wed, Jan 09, 2008 at 05:58:34PM +1000, Anthony Towns wrote:
> Disallow: /*/ # exclude everything but the shortcuts
> Allow: /cgi-bin/bugreport.cgi?bug=
> Allow: /cgi-bin/pkgreport.cgi?pkg=*;dist=unstable$
> I've set that up on rietz for Googlebot, we'll see if it works ok
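Assembled into a complete robots.txt, the Googlebot policy quoted above would look roughly like this (a sketch of what was deployed, not the exact file on rietz; note that the `*` and `$` wildcards are extensions honored by Googlebot, not part of the original robots.txt specification):

```
User-agent: Googlebot
Disallow: /*/          # exclude everything but the shortcuts below
Allow: /cgi-bin/bugreport.cgi?bug=
Allow: /cgi-bin/pkgreport.cgi?pkg=*;dist=unstable$

User-agent: *
Disallow: /
```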
On Wed, Jan 09, 2008 at 12:54:32PM -0800, Don Armstrong wrote:
> On Wed, 09 Jan 2008, Anthony Towns wrote:
> > Uh, you're kidding right? The BTS's own search engine won't turn up hits
> > outside the BTS, as a trivial example...
> It's far superior to google for searching for results *in* the BTS.
On Thu, 10 Jan 2008, Anthony Towns wrote:
> On Wed, Jan 09, 2008 at 05:58:34PM +1000, Anthony Towns wrote:
> > Getting smarturl.cgi properly done is still probably the real solution.
>
> Okay, so I've made smarturl.cgi work again; it was broken by:
>
>- Debbugs::CGI not accepting params from ARGV (smarturl.cgi changed
>  to set QUERY_STRING)
On Wed, 09 Jan 2008, Anthony Towns wrote:
> On Thu, Jan 03, 2008 at 01:07:15PM -0800, Don Armstrong wrote:
> > There are already mirrors which allow indexing, and you can use the
> > BTS's own search engine which is far superior to google [...]
>
> Uh, you're kidding right? The BTS's own search engine won't turn up
> hits outside the BTS, as a trivial example...
On Wed, Jan 09, 2008 at 05:58:34PM +1000, Anthony Towns wrote:
> Getting smarturl.cgi properly done is still probably the real solution.
Okay, so I've made smarturl.cgi work again; it was broken by:
- Debbugs::CGI not accepting params from ARGV (smarturl.cgi changed
to set QUERY_STRING)
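The fix described above — accepting a query string from the command line by exporting it as QUERY_STRING before the CGI layer parses parameters — can be sketched in Python (debbugs itself is Perl; the function name and demo query here are illustrative):

```python
import os
from urllib.parse import parse_qs

def cgi_params(argv):
    """Return CGI parameters, accepting a query string from argv
    when the script is run outside a web server."""
    if "QUERY_STRING" not in os.environ and len(argv) > 1:
        # Mirror the smarturl.cgi change: export the argument so that
        # downstream code reading QUERY_STRING still works unmodified.
        os.environ["QUERY_STRING"] = argv[1]
    return parse_qs(os.environ.get("QUERY_STRING", ""))

os.environ.pop("QUERY_STRING", None)  # simulate a command-line invocation
params = cgi_params(["smarturl.cgi", "bug=459818"])
print(params)  # {'bug': ['459818']}
```

This keeps the web-server path untouched (QUERY_STRING already set) while making the script debuggable from a shell.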
On Thu, Jan 03, 2008 at 01:07:15PM -0800, Don Armstrong wrote:
> There are already mirrors which allow indexing, and you can use the
> BTS's own search engine which is far superior to google [...]
Uh, you're kidding right? The BTS's own search engine won't turn up hits
outside the BTS, as a trivial example...
2008/1/3, Don Armstrong <[EMAIL PROTECTED]> wrote:
> On Thu, 03 Jan 2008, Jason Spiro wrote:
> > http://en.wikipedia.org/wiki/Robots.txt#Crawl-delay_directive will
> > help. Yahoo and MSNBot both support it. I bet other major bots
> > support it too. So we can allow Yahoo and MSNBot (plus Googlebot
On Thu, 03 Jan 2008, Jason Spiro wrote:
> Package: www.debian.org
> Severity: wishlist
>
> Please allow search engines to index http://bugs.debian.org. This can
> be done by deleting the file http://bugs.debian.org/robots.txt.
Most of the content is generated dynamically nowadays and this file h
On Thu, 03 Jan 2008, Jason Spiro wrote:
> http://en.wikipedia.org/wiki/Robots.txt#Crawl-delay_directive will
> help. Yahoo and MSNBot both support it. I bet other major bots
> support it too. So we can allow Yahoo and MSNBot (plus Googlebot, if
> they support it too) and block everyone else.
Google
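If the Crawl-delay approach were adopted, the per-crawler policy might look like this (a sketch; the 10-second delay and bot list are illustrative, and note that Googlebot does not honor Crawl-delay, so its rate would need to be set elsewhere):

```
User-agent: Slurp       # Yahoo
Crawl-delay: 10         # at most one request every 10 seconds

User-agent: msnbot
Crawl-delay: 10

User-agent: *
Disallow: /
```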
2008/1/3, Don Armstrong <[EMAIL PROTECTED]> wrote:
>
> On Thu, 03 Jan 2008, Jason Spiro wrote:
>
> > Please allow search engines to index http://bugs.debian.org. This can
> > be done by deleting the file http://bugs.debian.org/robots.txt.
>
> Just for the record, the reasons why we disallow indexi
On Thu, Jan 03, 2008 at 01:07:15PM -0800, Don Armstrong wrote:
> On Thu, 03 Jan 2008, Jason Spiro wrote:
> > Please allow search engines to index http://bugs.debian.org. This can
> > be done by deleting the file http://bugs.debian.org/robots.txt.
>
> Just for the record, the reasons why we disall
On Thu, 03 Jan 2008, Jason Spiro wrote:
> Please allow search engines to index http://bugs.debian.org. This can
> be done by deleting the file http://bugs.debian.org/robots.txt.
Just for the record, the reasons why we disallow indexing are because
the robots.txt specification isn't complete enough
reassign 458939 bugs.debian.org
thanks
On Thu, Jan 03, 2008 at 07:40:12PM +, Jason Spiro wrote:
> Package: www.debian.org
> Severity: wishlist
>
> Please allow search engines to index http://bugs.debian.org. This can
> be done by deleting the file http://bugs.debian.org/robots.txt.
Hello, t
Package: www.debian.org
Severity: wishlist
Please allow search engines to index http://bugs.debian.org. This can
be done by deleting the file http://bugs.debian.org/robots.txt.
Cheers,
--
Jason Spiro: corporate trainer, web developer, IT consultant.
I support Linux, UNIX, Windows, and more.
Con