Greetings,

I happen to be a subscriber to SearchEngineWatch.com
(not free; US$49.95/year). Last week there was an
interesting article written by Danny Sullivan, its owner.
The article is a summary of a talk he gave at the
Search Engine Strategies Conference.

Hopefully it will be of some use.

Cheers,
= Muhamad Syukri =

-------------------------------------------------------------

Submitting To Search Engines & Encouraging Crawlers

In an ideal world, you would never need to submit your web site to
crawler-based search engines. Instead, they would automatically come to
your site, locate all of your pages by following links, then list each of
these pages. That doesn't mean you would rank well for every important
term, but at least all the content within your web site would be fully
represented. Think of it like playing a lottery. Each page represented in a
search engine is like a ticket in the lottery. The more tickets you have,
the more likely you are to win something.

In the real world, search engines miss pages. There are several reasons why
this may happen. For instance, consider a brand new web site. If no one
links to this web site, then search engines may not locate it during their
normal crawls of the web. The site essentially remains invisible to them.

This is why Add URL forms exist. Search engines operate them so they can be
notified of pages that should be considered for indexing. Submitting via
Add URL doesn't mean a page will automatically be listed, but it does bring
the page to the search engine's attention.
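
Mechanically, an Add URL submission is nothing more than a single HTTP
request carrying the page's address. Here is a minimal sketch in Python;
the endpoint and parameter name are made up for illustration, since each
engine runs its own form:

  from urllib.parse import urlencode
  from urllib.request import urlopen

  def submit_url(page_url, endpoint="http://searchengine.example/addurl"):
      # Notify a (hypothetical) crawler-based engine of a page to consider.
      # Acceptance of the request does not mean the page will be listed.
      query = urlencode({"url": page_url})
      with urlopen(endpoint + "?" + query) as response:
          return response.status

  # submit_url("http://mysite.com/page1.html")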

Add URL forms have long been an important tool for webmasters looking to
increase their representation through "deep submitting," which I'll cover
below. But it is also important that you consider site architectural
changes that can encourage "deep crawling." These architectural changes
should keep your site better represented in the long run.

Deep Submitting

In the past, there was a strong relationship between using Add URL forms
and pages getting listed. Pages submitted via Add URL would tend to get
listed and listed more quickly than pages that were not submitted using the
forms. For this reason, people often did "deep submits." They would submit
many pages at the same time, hoping to get them all listed quickly. In
fact, Go (Infoseek) even had a system where you could email thousands of
URLs, all of which would be added within a week.

Those days are essentially gone. In my opinion, there is very little value
for most people in spending much time on deep submits. That is because many
search engines have altered the behavior of their Add URL forms in response
to spamming attempts.

Go is a good example of this. Until last November, any page submitted to Go
via the Add URL form would appear within a day or so. Now, only "root" URLs
are accepted. For instance, if you submitted all these URLs:
http://mysite.com/page1.html
http://mysite.com/page2.html
http://mysite.com/section/page2.html
Go would simply shorten them to this core URL:
http://mysite.com

It would then visit that URL and follow links from it to other pages,
deciding on its own what to gather.
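
In other words, Go normalizes whatever you submit down to the root. A
small sketch of that shortening step, using only Python's standard
library:

  from urllib.parse import urlparse

  def root_url(url):
      # Keep only the scheme and host; discard any path to a deeper page.
      parts = urlparse(url)
      return parts.scheme + "://" + parts.netloc

  for submitted in ["http://mysite.com/page1.html",
                    "http://mysite.com/page2.html",
                    "http://mysite.com/section/page2.html"]:
      print(root_url(submitted))  # each prints http://mysite.com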

Deep submits can still be effective in some places. For instance, AltaVista
will list any page submitted within a day or two. Therefore, submitting to
AltaVista directly can increase the representation or freshness of your
listings. However, AltaVista also considers excessive submission to be
spamming. Submit too many pages per day, and you may find yourself locked
out of the Add URL form. Even worse, you might find all your pages removed.

AltaVista doesn't publish a submission limit, but staying under five pages
per day is a good rule of thumb.
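
If you do plan a larger submit to AltaVista, the safe approach is to
spread it across days. A minimal sketch, assuming the five-per-day rule
of thumb above (not a published figure):

  def schedule_submissions(urls, per_day=5):
      # Split a URL list into daily batches that stay under the limit.
      return [urls[i:i + per_day] for i in range(0, len(urls), per_day)]

  pages = ["http://mysite.com/page%d.html" % i for i in range(12)]
  for day, batch in enumerate(schedule_submissions(pages), start=1):
      print("day %d: %d URLs" % (day, len(batch)))  # 5, 5, then 2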

Inktomi is another place where deep submits still seem effective. In recent
weeks, I've noticed that submissions made to Inktomi via HotBot's Add URL
form have appeared within two weeks, if not sooner. In fact, Inktomi has
suggested that pages submitted using Add URL will be "tested" within its
index for a short period of time. If the pages seem to satisfy queries,
then they may be retained. As for limits, HotBot will allow you to add up
to 50 pages per day.

Excite has suggested that it, too, will operate a system similar to
Inktomi's, where submitted pages will be tested for a short period of time.
However, I've seen no evidence of this actually happening. For that reason,
I don't suggest wasting your time on a deep submit to Excite, which allows
the submission of 25 URLs per week, per web site.

Finally, Lycos is probably the last place you might want to be concerned
about doing a deep submit. It has always shown a tendency to be more likely
to list pages that are submitted to it. Lycos has no submission limits, but
I would suggest staying under 50 URLs per day.
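
For reference, here are the limits discussed above, collected into one
lookup table. The AltaVista and Lycos figures are my rules of thumb
rather than published policies:

  SUBMISSION_LIMITS = {
      "AltaVista":        {"urls": 5,  "per": "day",  "basis": "rule of thumb"},
      "HotBot (Inktomi)": {"urls": 50, "per": "day",  "basis": "stated"},
      "Excite":           {"urls": 25, "per": "week", "basis": "stated"},
      "Lycos":            {"urls": 50, "per": "day",  "basis": "rule of thumb"},
  }

  def within_limit(engine, submitted_so_far):
      # True if one more submission stays within that engine's limit.
      return submitted_so_far < SUBMISSION_LIMITS[engine]["urls"]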

Encouraging Deep Crawling

It's important to remember that even if you don't submit each and every one
of your pages, search engines may still add some of them anyway. Crawlers
follow links -- if you have good internal linking within the pages of your
web site, then you increase the odds that even pages you've never submitted
may still get listed.

In fact, some search engines routinely do "deep crawls" of web sites. None
of them will list all of your pages, but they will gather a good amount
beyond those you actually submit. Currently, deep crawlers are AltaVista,
Inktomi, FAST and Northern Light. And even non-deep crawlers will still
tend to gather some pages beyond those actually submitted, assuming they
find links to these pages from somewhere within your site.

However, even the best of the deep crawlers will have problems with large
web sites. This is because crawlers try to be "polite" when they visit
sites and not request so many pages that they might overwhelm a web server.

For instance, they might request a page every 30 seconds over the course of
an hour. Obviously, this won't allow them to view many pages. Other
crawlers are simply not interested in gathering every single page you have.
They'll get a good chunk, then move on to other sites.
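
To make the trade-off concrete, here is a minimal sketch of a polite
crawler in Python: it follows links within one site, waits between
requests, and stops after a fixed haul. No real engine's crawler works
exactly this way; the delay and page cap are illustrative:

  import time
  from html.parser import HTMLParser
  from urllib.parse import urljoin, urlparse
  from urllib.request import urlopen

  class LinkExtractor(HTMLParser):
      def __init__(self):
          super().__init__()
          self.links = []
      def handle_starttag(self, tag, attrs):
          if tag == "a":
              for name, value in attrs:
                  if name == "href" and value:
                      self.links.append(value)

  def polite_crawl(start_url, delay_seconds=30, max_pages=200):
      # Breadth-first crawl of one site, one request per delay interval.
      site = urlparse(start_url).netloc
      seen, queue, fetched = {start_url}, [start_url], []
      while queue and len(fetched) < max_pages:
          url = queue.pop(0)
          with urlopen(url) as response:
              html = response.read().decode("utf-8", errors="replace")
          fetched.append(url)
          parser = LinkExtractor()
          parser.feed(html)
          for link in parser.links:
              absolute = urljoin(url, link)
              # Stay on the same site; skip pages already queued.
              if urlparse(absolute).netloc == site and absolute not in seen:
                  seen.add(absolute)
                  queue.append(absolute)
          time.sleep(delay_seconds)  # politeness: don't hammer the server
      return fetched

At 30 seconds per request, 200 pages takes well over an hour and a half --
which is exactly why crawlers stop short on large sites.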
For this reason, you might want to consider breaking up your site into
smaller web sites. For instance, consider a typical shopping site that
might have sections like this:
http://site.com/
http://site.com/books/
http://site.com/movies/
http://site.com/music/

The first URL is the home page, which talks about books, movies and music
available within this site. The second URL is the book section, which
contains information about all the books on sale. The third URL is the
movie section, and the fourth is the music section.

Now imagine that the three main sections have 500 pages of product
information each. Altogether, that gives the site about 1,500 pages
available for spidering. Next, let's assume that the best deep crawler
tends to only pick up about 250 pages from each site it visits -- this
number is completely made up, but it will serve to illustrate the point.
This would mean that only 250 pages out of 1,500 are spidered, or 17
percent of all those available.

Now it is time to consider subdomains. Any domain that you register, such
as "site.com," can have an endless number of "subdomains" that make use of
the core domain. All you do is add a word to the left of the core domain,
separated by a dot, such as "subdomain.site.com." These subdomains can then
be used as the web addresses of additional web sites. So returning to our
example, let's say we create three subdomains and use them as the addresses
of three new web sites, like so:
http://books.site.com
http://movies.site.com
http://music.site.com
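
Constructing the addresses is simple string prefixing; the sites
themselves still have to be set up on the server, as discussed at the
end of this article:

  CORE_DOMAIN = "site.com"
  SECTIONS = ["books", "movies", "music"]

  sites = ["http://%s.%s" % (section, CORE_DOMAIN) for section in SECTIONS]
  # -> ['http://books.site.com', 'http://movies.site.com',
  #     'http://music.site.com']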

Now we move all the book content from our "old" web site into the new
"books.site.com" site, doing the same thing for our movies and music
content. Each site stands independently of the others. That means when our
deep crawler comes, it gathers up 250 pages from one site, moves to the
next to gather another 250, then does the same thing with the third. In
all, 750 pages of 1,500 are gathered -- 50 percent of all those available.
That's a huge increase over the 17 percent that were gathered when you
operated one big web site.
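
The arithmetic behind those two figures, worked through with the same
made-up 250-pages-per-site assumption:

  PAGES_TOTAL = 1500          # 3 sections x 500 product pages
  PAGES_PER_SITE_CRAWL = 250  # illustrative deep-crawl haul per site

  one_site = PAGES_PER_SITE_CRAWL / PAGES_TOTAL         # about 0.17
  three_sites = 3 * PAGES_PER_SITE_CRAWL / PAGES_TOTAL  # 0.50
  print("one site: %.0f%%, three sites: %.0f%%"
        % (100 * one_site, 100 * three_sites))
  # one site: 17%, three sites: 50%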

Root Page Advantage

The change also gives you more "root" pages, which tend to be more highly
ranked than any other page you will have. The root page is whatever page
appears when you just enter the domain name of a site. Usually, this is the
same as your home page. For instance, if you enter this into your browser:
searchenginewatch.com

The page that loads is both the Search Engine Watch home page and the
"root" page for the Search Engine Watch web server. However, if you have a
site within someone else's web server, like this...
http://www.server.com/mysite/
...then your home page is not also the root page. That's because the server
has only one root page, whatever loads when you enter "server.com" into
your browser.
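
The distinction is easy to check mechanically: a URL is a root URL when
nothing follows the host name. A small sketch:

  from urllib.parse import urlparse

  def is_root_url(url):
      # Root URLs have an empty path (or just "/") after the host.
      return urlparse(url).path in ("", "/")

  print(is_root_url("http://searchenginewatch.com"))   # True
  print(is_root_url("http://www.server.com/mysite/"))  # False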

So in our example, there used to be only one root page, the one that appeared
when someone went to "site.com," and this page had to be focused around all
different product terms. Now, each of the new sites also has a root page --
and each page can be specifically about a particular product type.

Breaking up a large site might also help you with directories. Editors tend
to prefer listing root URLs rather than long addresses that lead to pages
buried within a web site. So to some degree, breaking up your site into
separate sites should give each site more respect.

Some Final Words

If you decide to go the subdomain route, you'll need to talk with your
server administrator about establishing the new domains. There is no
registration fee involved, but the server company might charge a small
administrative fee to establish the new addresses. Of course, you may also
have to pay an additional monthly charge for each site you operate.

You could also register entirely new domains. However, I suggest subdomains
for a variety of reasons. First, there's no registration fee to pay.
Second, it's nice to see the branding of your core URL replicated in the
subdomains. Finally, search engines have seemed to treat subdomains with as
much respect as completely different domains. Given this, I see no major
reason to register new domains.
