aseek-devel  

Re: [aseek-devel] Memory not being released when index is done

Karen Barnes
Wed, 25 Sep 2002 09:06:53 -0700

Hi John,

I've experienced the exact same thing. When running the indexer I can see 
memory climb continually until all of it is consumed. I only have about 
2,000,000 URLs (many just robots.txt), and that's what happens whenever I 
run an index. When the index is done the memory is not released, and other 
programs start swapping to disk, which ultimately thrashes the hard drive. 
I too have 2GB RAM, RAID 5 and 205GB of free space. I never had this problem 
before with the same MySQL setup (my.cnf) I have always used, but since I 
installed aspseek that's what happens. Here are my memory stats:

 r  b  w   swpd   free   buff    cache  si  so    bi    bo   in    cs
 0  0  0   1044  16356  42120  1186860   0   0    29   109  111   102

I run the index like this "./index -N 80"

Top shows this:

  9:20am  up 22:45,  2 users,  load average: 0.28, 0.24, 0.19
288 processes: 287 sleeping, 1 running, 0 zombie, 0 stopped
CPU0 states:  0.1% user,  0.0% system,  0.0% nice, 99.4% idle
CPU1 states:  0.4% user,  0.2% system,  0.0% nice, 98.5% idle
CPU2 states:  0.4% user, 16.3% system,  0.0% nice, 82.4% idle
CPU3 states:  8.2% user, 19.1% system,  0.0% nice, 72.1% idle
Mem:  2065152K av, 2044500K used,   20652K free,       0K shrd,   42584K buff
Swap: 2096472K av,    1044K used, 2095428K free                 1181836K cached

  PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME COMMAND
13679 scrubweb  25   0  309M 309M  1792 S     0.0 15.3   0:17 index
and of course many, many more index processes follow

The my.cnf looks like this:

# The MySQL server
[mysqld]
user            = mysql
set-variable    = max_connections=512
port            = 3306
socket          = /tmp/mysql.sock
skip-locking
set-variable    = key_buffer=256M
set-variable    = max_allowed_packet=3M
set-variable    = table_cache=256
set-variable    = sort_buffer=1M
set-variable    = net_buffer_length=8K
set-variable    = myisam_sort_buffer_size=64M
log-bin
server-id       = 1

[mysqldump]
quick
set-variable    = max_allowed_packet=16M

[mysql]
no-auto-rehash

[isamchk]
set-variable    = key_buffer=128M
set-variable    = sort_buffer=128M
set-variable    = read_buffer=2M
set-variable    = write_buffer=2M

[myisamchk]
set-variable    = key_buffer=128M
set-variable    = sort_buffer=128M
set-variable    = read_buffer=2M
set-variable    = write_buffer=2M

[mysqlhotcopy]
interactive-timeout

Running "free" shows:

             total       used       free     shared    buffers     cached
Mem:       2065152    2047720      17432          0      43224    1184380
-/+ buffers/cache:     820116    1245036
Swap:      2096472       1044    2095428

That's a lot of memory being consumed. This is a standalone box, so the only 
thing running right now is index. I have searchd running, but nothing is 
accessing it. Yesterday I rebooted my system to recover the memory. I waited 
through the night, and when I checked RAM this morning it showed that only 
175,345 was used. All the other programs, such as mysql and even searchd, 
were running. As soon as I started index I could just watch the memory 
trickle away. By the end of the indexing process the system will be 
thrashing the disk, and the cached RAM will never be released.
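
For anyone comparing numbers, here is a minimal sketch of the arithmetic 
behind the "-/+ buffers/cache" row in the "free" output above. It assumes 
the usual procps convention that this row is the "Mem:" row with buffers and 
cached counted as free rather than used; the function name is made up for 
illustration. It only reproduces the figures and says nothing about whether 
the cached pages actually get released in practice.

```python
# Figures copied from the "Mem:" row of the free output above (all in KB).
used, free, buffers, cached = 2047720, 17432, 43224, 1184380


def buffers_cache_adjusted(used, free, buffers, cached):
    """Return (used, free) with buffers and page cache counted as free.

    Hypothetical helper: mirrors how the "-/+ buffers/cache" row is assumed
    to be derived from the "Mem:" row.
    """
    reclaimable = buffers + cached
    return used - reclaimable, free + reclaimable


app_used, effectively_free = buffers_cache_adjusted(used, free, buffers, cached)
print(f"used excluding buffers/cache: {app_used} KB")
print(f"free including buffers/cache: {effectively_free} KB")
```

Both results match the "-/+ buffers/cache" row printed by free (820116 and 
1245036).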

Just thought I would let you know my experience. Seems like this is just how 
things work with index. If I don't spawn off 80 threads it will take all day 
to index 200,000 URLs.

Karen

>Today I started with an empty index and inserted 175,000 URLs:
>
>./index -i -f myurls.txt
>
>Then I ran the indexer:
>
>./index -N 100 -s 0
>
>That consumed a bunch of memory once everything was resolved and running. 
>I'm running 2GB of DDR 200MHz RAM on a dual 2.2GHz Xeon box with Linux 
>running the latest kernel. While running, index not only consumes the 
>entire 2GB but also about 2,000k of disk swap. Once index is complete, none 
>of the RAM is released. The only way I can get it back, so that other 
>programs are not swapping memory to disk, is to reboot. This process took 
>7 hours, by the way.
>
>Has anyone experienced the same problem? I could easily index 1,000,000 
>URLs all at once using mysql and a multi-threaded perl indexer with minimum 
>RAM consumption and when done (about 20 hours later) all RAM is released. 
>Maybe it's because I wrote the Perl script? All other programs I have used 
>release the RAM back, but not aspseek's indexer for some reason.
>
>Thanks!