Re: UdmSearch: Webboard: Segfault (grrr)

2001-02-15 Thread Zenon Panoussis



Zenon Panoussis skrev:
> 

> Oops. Something else is not OK:
 
> cache.c:687:87: warning: #ifdef with no argument
[etc]

I think that the mailer is responsible for this. There are 
lots of broken lines in the code that shouldn't be broken. 
Perhaps it's better to attach the file in .gz format instead 
of as text.

Z


-- 
oracle@everywhere: The ephemeral source of the eternal truth...
__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




Re: UdmSearch: Webboard: Segfault (grrr)

2001-02-15 Thread Zenon Panoussis



Alexander Barkov skrev:
> 

> We finally found a bug in cache.c. New version is in attachement.
> Everybody who has problems with splitter's crashes are welcome to test.
> Please, give feedback!

Oops. Something else is not OK: 

cache.c:687:87: warning: #ifdef with no argument
cache.c:692:87: warning: #ifdef with no argument
cache.c:697:87: warning: #ifdef with no argument
cache.c:702:87: warning: #ifdef with no argument
cache.c: In function `UdmFindCache':
cache.c:969: parse error before `?'
cache.c:982: `real_num' undeclared (first use in this function)
cache.c:982: (Each undeclared identifier is reported only once
cache.c:982: for each function it appears in.)
cache.c:994: `fd1' undeclared (first use in this function)
cache.c:996: `group' undeclared (first use in this function)
cache.c:1000: `group_num' undeclared (first use in this function)
cache.c: At top level:
cache.c:1011: initializer element is not constant
cache.c:1011: warning: data definition has no type or storage class
cache.c:1012: parse error before string constant
cache.c:1013: parse error before string constant
cache.c:1013: warning: data definition has no type or storage class
cache.c:1014: redefinition of `ticks'
cache.c:1011: `ticks' previously defined here
cache.c:1014: initializer element is not constant
cache.c:1014: warning: data definition has no type or storage class
cache.c:1015: parse error before string constant
cache.c:1015: warning: data definition has no type or storage class
cache.c:1024: `i' undeclared here (not in a function)
cache.c:1024: parse error before `.'
cache.c:1030: register name not specified for `p'
cache.c:1032: parse error before `if'
cache.c:1035: `pmerg' undeclared here (not in a function)
cache.c:1035: `pmerg' undeclared here (not in a function)
cache.c:1035: warning: data definition has no type or storage class
cache.c:1036: parse error before `&'
cache.c:1043: `k' undeclared here (not in a function)
cache.c:1043: warning: data definition has no type or storage class
cache.c:1044: parse error before `}'
cache.c:1046: conflicting types for `p'
cache.c:1030: previous declaration of `p'
cache.c:1046: `pmerg' undeclared here (not in a function)
cache.c:1046: warning: data definition has no type or storage class
cache.c:1047: parse error before `&'
cache.c:1048: parse error before `->'
cache.c:1058: warning: initialization makes integer from pointer without a cast
cache.c:1058: warning: data definition has no type or storage class
cache.c:1058: parse error before `}'
cache.c:1061: redefinition of `ticks'
cache.c:1014: `ticks' previously defined here
cache.c:1061: initializer element is not constant
cache.c:1061: warning: data definition has no type or storage class
cache.c:1063: parse error before string constant
cache.c:1071: warning: parameter names (without types) in function declaration
cache.c:1071: conflicting types for `UdmGroupByURL'
../include/udm_searchtool.h:7: previous declaration of `UdmGroupByURL'
cache.c:1071: warning: data definition has no type or storage class
cache.c:1072: parse error before `}'
make[1]: *** [cache.lo] Error 1
make[1]: Leaving directory `/root/mnogosearch-3.1.10/src'
make: *** [all-recursive] Error 1


-- 
oracle@everywhere: The ephemeral source of the eternal truth...
__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




Re: UdmSearch: Webboard: Segfault (grrr)

2001-02-15 Thread Zenon Panoussis



Alexander Barkov skrev:
> 

> We finally found a bug in cache.c. New version is in attachement.
> Everybody who has problems with splitter's crashes are welcome to test.
> Please, give feedback!

You guys are great! I'll re-compile and get back to you with 
reports. 

BTW, can I remove http://search.freewinds.cx/garbage_in_sbin.tar.gz 
now? 

Z


-- 
oracle@everywhere: The ephemeral source of the eternal truth...
__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




UdmSearch: Webboard: Indexer still runs but search.cgi does not and a small story about the problems

2001-02-15 Thread Zenon Panoussis

Author: Zenon Panoussis
Email: [EMAIL PROTECTED]
Message:

> I have tried 600, which is fine for the indexer, and I have tried 
> even 777 but it did not make any difference. After all, search 
> (and search.cgi) can read it via ssh but it fails via browser ...

Oops - error. When you access it via ssh, it is user "wendibus" 
reading the file, while if you access it via the browser it is 
user "nobody" reading it. If the permissions of search.htm 
were -rw------- (and everything else OK), you would get exactly 
that effect. Search.htm should be -rw-r--r-- so that "nobody" 
can read it. 
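
In shell terms that is simply (a sketch; the path is an example, use 
wherever your search.htm actually lives):

  # make search.htm world-readable so the web server user "nobody" can read it
  chmod 644 /usr/local/mnogosearch/etc/search.htm
  ls -l /usr/local/mnogosearch/etc/search.htm   # should now show -rw-r--r--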

In your particular case the problem is elsewhere (and the thing 
didn't work with search.htm in -rwxrwxrwx mode either), but 
anybody else reading the webboard should keep this in mind. 

Z



Reply: <http://search.mnogo.ru/board/message.php?id=1427>

__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




Re: UdmSearch: no files found in mirror directories

2001-02-15 Thread Zenon Panoussis



Caffeinate The World skrev:
> 
 
> i have indexer going but i see nothing in the mirror directories. when
> does it store the pages to the mirror directory?

If your pages are already indexed, when you re-index with -a 
indexer will check the headers and only download files that 
have been modified since the last indexing. Thus, all pages 
that are not modified will not be downloaded and therefore not 
mirrored either. To create the mirror you need to either 
(a) start again with a clean database or (b) use the -m switch. 
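
A sketch of option (b), assuming the default directory layout used 
elsewhere in this thread, and reading -m as described above (refetch 
documents regardless of their stored modification dates):

  cd /usr/local/mnogosearch/sbin
  ./indexer -m    # re-download the documents so the mirror gets populated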

Z


-- 
oracle@everywhere: The ephemeral source of the eternal truth...
__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




Re: UdmSearch: Setting up

2001-02-15 Thread Zenon Panoussis


 
> Basically, my UNIX/LINUX knowledge is non-existent. I've FTP'd the tar file
> to our LINUX box, and have extracted it... but where to go from there??? I
> haven't got a clue.

If that is so, you are bound to run into problems all the time. 
Perhaps installing a search engine is not the first thing you 
should do as basic Linux training. 

Anyway, I'll try to help you a bit on the way. In the following 
I assume that you are logged in as root. If you don't have root 
access to your machine, more steps are needed, marked with [*]. 
In that case, ask again.

- Install your database (mysql or whatever you are going to use).
  Make sure it is working.
- Download the .tar.gz file to your home directory, e.g. /root 
- Unpack it with #tar zxvf mnogosearch-3.1.10.tar.gz 
- Move into the newly created directory with #cd mnogosearch-3.1.10
- [*]
- Prepare the source with #./configure --with-[your_database]
- Compile with #make && make install
- You will now have a new directory called /usr/local/mnogosearch. 
  Go there with #cd /usr/local/mnogosearch/etc 
- Create a new database and tables in it. Since I don't know 
  what database you are using, I can't help you here. Note that 
  you have to take at least two steps: (1) create the database, 
  (2) create the sql tables in it. Two additional steps that 
  are highly recommended but not necessary are to (3) add the 
  stopword tables of your choice and (4) create a new user on 
  the database, so that indexer and search don't run as root.
- Edit indexer.conf with #vi indexer.conf (this is torture: 
  now you have to learn vi while trying to install the search. 
  I'm sorry to tell you, vi is a bitch. Anyway, use the arrows 
  to move around. Use i to enter insert mode. Use [esc] to exit 
  insert mode. Use [esc]:x[enter] to save and exit. With these 
  four things you should be able to edit a file).

  If you edit the following items to correspond to your settings 
  you will have a minimal working configuration:
   - DBAddr  (example: mysql://username:password@machine/database/ )
   - DBMode  (example: single)
   - LocalCharset (example: LocalCharset iso-8859-1 )
   - Server (example: Server path http://www.domain.dom/path )
  For DBMode and LocalCharset all you need to do is uncomment 
  the right line (remove the "#" in front of it). Save and exit.
- Copy search.htm-dist to search.htm with the command 
  #cp search.htm-dist search.htm
- Edit search.htm with #vi search.htm . Adjust DBAddr, DBMode and 
  LocalCharset to the exact same settings as in indexer.conf. 
  Don't touch anything else. Save and exit.
- [*]
- Find where the cgi-bin directory of your webserver is. If you 
  are running apache without virtual domains it will be in 
  /var/www/cgi-bin or /vol/www/cgi-bin or something similar. 
  If you are working on your ISP's machine, ask the ISP. 
- Copy search.cgi to the cgi-bin directory. Assuming that you 
  are still in /usr/local/mnogosearch/etc , you do that with 
  #cp ../bin/search.cgi /[full_path_to]/cgi-bin/search.cgi . 
- [*]
- Try to access search.cgi with your browser. Go to 
  http://www.your_domain.dom/cgi-bin/search.cgi . If you get 
  a search box, you have come a long way. If not, you need 
  more help. When asking for it, describe exactly what you 
  did, how you did it, what worked or did not and what 
  error messages you are getting.
- Change to the sbin directory of the mnogo installation. 
  Assuming that you are still in /usr/local/mnogosearch/etc , 
  do #cd ../sbin 
- Start indexer with #./indexer -a -c 300 . This will cause 
  indexer to run for 5 minutes and stop. Go back to your 
  browser and search for a word that you know occurs in the 
  files that you just saw indexer index. If you get results, 
  you have come a very long way. In that case you can 
- Re-start indexer with #./indexer  and let it finish its job. 

Good luck.
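
For reference, the same sequence condensed into one shell session (a 
sketch only: --with-mysql, the paths and the cgi-bin location are 
assumptions to adapt, and the database/table creation step is left out 
because it depends on the database you chose):

  cd /root
  tar zxvf mnogosearch-3.1.10.tar.gz
  cd mnogosearch-3.1.10
  ./configure --with-mysql       # or --with-[your_database]
  make && make install

  cd /usr/local/mnogosearch/etc
  # ... create the database and its tables, then edit indexer.conf ...
  cp search.htm-dist search.htm
  # edit search.htm so DBAddr, DBMode and LocalCharset match indexer.conf
  cp ../bin/search.cgi /var/www/cgi-bin/search.cgi   # adjust to your cgi-bin

  cd ../sbin
  ./indexer -a -c 300   # index for 5 minutes as a test, then check the browser
  ./indexer             # if the test search works, let indexer finish the job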

Z


-- 
oracle@everywhere: The ephemeral source of the eternal truth...
__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




Re: UdmSearch: Bug report

2001-02-15 Thread Zenon Panoussis



Bernd Schulze skrev:
> 

> Files that consist of html code get a
> correct title entry in the database.
> Other document types (and we have 90%
> pdf) (that is files that do not have the
> possibility of a title tag) get assigned
> the last title that has been successfully
> found in a tag.

I've had this problem with .txt files and v 3.1.8/mysql. After 
indexing everything, I removed the .txt files from the index 
(indexer -C -u %.txt) and re-indexed them (indexer -a -u %.txt). 
Somehow that fixed the problem and all .txt files got the 
correct "No title" title. 

Z


-- 
oracle@everywhere: The ephemeral source of the eternal truth...
__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




Re: UdmSearch: Webboard: Segfault (grrr)

2001-02-14 Thread Zenon Panoussis



Zenon Panoussis skrev:
> 

> By now, I have almost 1 GB of indexed files, 4 indexer
> crashes and one splitter crash. I'll do the debugging and
> post its output tomorrow.

===
# gdb indexer core.indexer.01
GNU gdb 5.0
Copyright 2000 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you
are
welcome to change it and/or distribute copies of it under certain
conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for
details.
This GDB was configured as "i386-redhat-linux"...
Core was generated by `./indexer -m -s 200'.
Program terminated with signal 11, Segmentation fault.
Reading symbols from /usr/lib/mysql/libmysqlclient.so.10...done.
Loaded symbols for /usr/lib/mysql/libmysqlclient.so.10
Reading symbols from /lib/libm.so.6...done.
Loaded symbols for /lib/libm.so.6
Reading symbols from /usr/lib/libz.so.1...done.
Loaded symbols for /usr/lib/libz.so.1
Reading symbols from /lib/libc.so.6...done.
Loaded symbols for /lib/libc.so.6
Reading symbols from /lib/libcrypt.so.1...done.
Loaded symbols for /lib/libcrypt.so.1
Reading symbols from /lib/libnsl.so.1...done.
Loaded symbols for /lib/libnsl.so.1
Reading symbols from /lib/ld-linux.so.2...done.
Loaded symbols for /lib/ld-linux.so.2
Reading symbols from /lib/libnss_files.so.2...done.
Loaded symbols for /lib/libnss_files.so.2
Reading symbols from /lib/libnss_nisplus.so.2...done.
Loaded symbols for /lib/libnss_nisplus.so.2
Reading symbols from /lib/libnss_nis.so.2...done.
Loaded symbols for /lib/libnss_nis.so.2
Reading symbols from /lib/libnss_dns.so.2...done.
Loaded symbols for /lib/libnss_dns.so.2
Reading symbols from /lib/libresolv.so.2...done.
Loaded symbols for /lib/libresolv.so.2
#0  0x805e5fa in UdmCRC32 (buf=0x4021b03e "", size=4294967295) at crc32.c:97
97  _CRC32_(crc, *p) ;
(gdb) print crc
$1 = 1928826335
(gdb) print p
$2 = 0x40431000 

===

# gdb indexer core.indexer.02

#0  0x805e5fa in UdmCRC32 (buf=0x4021b03e "", size=4294967295) at crc32.c:97
97  _CRC32_(crc, *p) ;
(gdb) print crc
$1 = 835566978
(gdb) print p
$2 = 0x40404000 

===

# gdb indexer core.indexer.03

#0  0x805e5fa in UdmCRC32 (buf=0x4021b03e "", size=4294967295) at crc32.c:97
97  _CRC32_(crc, *p) ;
(gdb) print crc
$1 = 2869617068
(gdb) print p
$2 = 0x40404000 

===

# gdb indexer core.indexer.04

(gdb) print crc
$1 = 1253677059
(gdb) print p
$2 = 0x40431000 

===

And finally the splitter:

# gdb splitter core.splitter.01 

This GDB was configured as "i386-redhat-linux"...
Core was generated by `/usr/local/mnogo3110/sbin/splitter'.
Program terminated with signal 11, Segmentation fault.
Reading symbols from /usr/lib/mysql/libmysqlclient.so.10...done.
Loaded symbols for /usr/lib/mysql/libmysqlclient.so.10
Reading symbols from /lib/libm.so.6...done.
Loaded symbols for /lib/libm.so.6
Reading symbols from /usr/lib/libz.so.1...done.
Loaded symbols for /usr/lib/libz.so.1
Reading symbols from /lib/libc.so.6...done.
Loaded symbols for /lib/libc.so.6
Reading symbols from /lib/libcrypt.so.1...done.
Loaded symbols for /lib/libcrypt.so.1
Reading symbols from /lib/libnsl.so.1...done.
Loaded symbols for /lib/libnsl.so.1
Reading symbols from /lib/ld-linux.so.2...done.
Loaded symbols for /lib/ld-linux.so.2
#0  0x8057d15 in UdmSplitCacheLog (log=118) at cache.c:635
635             logwords[count+j].wrd_id=table[w].wrd_id;
(gdb) print count
$1 = 13121220
(gdb) print count+j
$2 = 13125316
(gdb) print logwords
$3 = (UDM_LOGWORD *) 0x0
(gdb) print table[w]
$4 = {wrd_id = 1918989871, weight = 1869507887, pos = 825454439, len = 1949249585}
(gdb) print logwords[count+j]
Cannot access memory at address 0x15e7bd70

===

This time I'm keeping the core dumps, so let me know if there's 
anything else you want me to check.

Apart from this, I got some garbage directories with misnamed 
splitter files in them in sbin:

# pwd
/usr/local/mnogo3110/sbin
# ls -l

-rw-r--r--   1 root   root   457672 Feb 13 08:28 àË???
drwxr-xr-x   3 root   root     4096 Feb 13 08:28 àË???3F
-rw-r--r--   1 root   root   487224 Feb 13 08:27 æmE

They are packed up at http://search.freewinds.cx/garbage_in_sbin.tar.gz

Z


-- 
oracle@everywhere: The ephemeral source of the eternal truth...
__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




UdmSearch: Webboard: Full on search engine?

2001-02-14 Thread Zenon Panoussis

Author: Zenon Panoussis
Email: [EMAIL PROTECTED]
Message:

Depends on what you mean. Follow the links at 
http://search.mnogo.ru/users.html and see what it can do.

Z


Reply: <http://search.mnogo.ru/board/message.php?id=1392>

__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




Re: UdmSearch: Webboard: Segfault (grrr)

2001-02-14 Thread Zenon Panoussis



Caffeinate The World skrev:
> 

> i've been going through this and back again time and time again. what
> would really be nice is indexer save the logs in a format that's easy
> to use again. for instance, you can use the format re-index to sql etc.
 
> or if you want to reindex again, you don't have to crawl through all
> the external websites. saves a lot of time and we can debug faster.

I'm not sure what you mean here. The Mirror statement does just that 
(and luckily, I had an almost complete mirror already). 

Z


-- 
oracle@everywhere: The ephemeral source of the eternal truth...
__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




Re: UdmSearch: Webboard: Segfault (grrr)

2001-02-13 Thread Zenon Panoussis



Zenon Panoussis skrev:
> 
 
> Now for 31 MB adventures :)

# ./run-splitter -k
Sending -HUP signal to cachelogd...
Done
# ./run-splitter -p
Preparing logs...
Open dir '/var/mnogo3110/raw'
Preparing word log 982024900  [   42176 bytes]
Preparing word log 982027284  [31465324 bytes]
Preparing word log 982027618  [ 8815804 bytes]
Preparing del log 982024900  
Preparing del log 982027284
Preparing del log 982027618
Renaming logs...
Done

Running ./run-splitter on these worked fine. No problems at all. 
After that, I went on indexing and created 

    59920 Feb 13 06:05 982040748.del.done
 31457740 Feb 13 06:05 982040748.wrd.done
     1480 Feb 13 06:06 982040807.del.done
   637240 Feb 13 06:06 982040807.wrd.done
    51920 Feb 13 07:21 982045300.del.done
 31469304 Feb 13 07:21 982045300.wrd.done
    69248 Feb 13 07:51 982047843.del.done
 30213344 Feb 13 07:51 982047843.wrd.done

another two 31 MB files and two smaller ones. All of them were 
split without problems.

[two days later] 

Indexing kept crashing (see separate posting) and splitting 
kept going fine until tonight, when the opposite occurred. 
By now, I have almost 1 GB of indexed files, 4 indexer 
crashes and one splitter crash. I'll do the debugging and 
post its output tomorrow. 

Z


-- 
oracle@everywhere: The ephemeral source of the eternal truth...
__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




UdmSearch: Webboard: This works

2001-02-13 Thread Zenon Panoussis

Author: Zenon Panoussis
Email: [EMAIL PROTECTED]
Message:
 
> this works, works ;-))
 
> I do have no right to write to the main cgi-bin of the server. 
> But I do have the right to install cgis in the user dirs. 

I begin to suspect that you are confusing mnogosearch/bin with 
cgi-bin. Can you give me the directory structure of *your* file 
area? Like 
   basedir
   basedir/mnogo
   basedir/www
etc. 

Now, if the little perl script I gave you worked, copy search.cgi 
to the same place and access it in the same way. Whatever you did 
to get that perl script to work: do the same with search.cgi. Do 
not recompile or anything, just copy it to where you want it.

Z


Reply: <http://search.mnogo.ru/board/message.php?id=1376>

__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




UdmSearch: Webboard: indexer runs and search.cgi does not

2001-02-13 Thread Zenon Panoussis

Author: Zenon Panoussis
Email: [EMAIL PROTECTED]
Message:
 
> Question: are you running your own web server? Is the cgi-bin of 
> your particular domain and user account *really* set to the cgi-bin 
> directory that you are using? Are you sure? 

If you don't know how the web server is configured, here is how 
to test it. 

Put this in a file called test.pl :

#!/usr/bin/perl -w
use CGI ':standard';

print header();
print start_html();
print h5("This works");
print end_html();

Do chmod a+x test.pl and place the file in the same directory 
as your search.cgi . Do ./test.pl on the shell; that should 
give you a simple HTML page. Now call the script from your 
browser with http://your.domain/cgi-bin/test.pl or with 
http://your.domain/your_dir/cgi-bin/test.pl . Does it work? 
If not, your problem is in the server configuration and the 
location of your cgi-bin. 
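
The same test as shell commands (a sketch; the cgi-bin path and the 
domain are examples, not your real ones):

  cp test.pl /var/www/cgi-bin/       # same directory as your search.cgi
  chmod a+x /var/www/cgi-bin/test.pl
  /var/www/cgi-bin/test.pl           # run on the shell: prints a small HTML page
  # then point the browser at http://your.domain/cgi-bin/test.pl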

Z



Reply: <http://search.mnogo.ru/board/message.php?id=1372>

__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




UdmSearch: Webboard: Stupid Question about the host

2001-02-13 Thread Zenon Panoussis

Author: Zenon Panoussis
Email: [EMAIL PROTECTED]
Message:

> I have another stupid question. Is it possible that I have to call 
> the configure script with an option for the host type?
 
> I have installed it via ssh on the host but have not used any 
> stuff there... and the database is on a different server then 
> the script is. 

It needs to be configured for the machine where indexer runs, 
not the machine where mysql and/or the databases reside. If 
you compiled it on the same machine as the one indexer runs 
on, it should be fine. Besides, we already know that search.cgi 
works from the shell, so it can't be a platform problem you 
are having. 

Just as general information though, configure does allow you 
to compile for different machines. Do ./configure --help and 
check the "Host type" section. 

Z


Reply: <http://search.mnogo.ru/board/message.php?id=1371>

__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




UdmSearch: Webboard: indexer runs and search.cgi does not

2001-02-13 Thread Zenon Panoussis

Author: Zenon Panoussis
Email: [EMAIL PROTECTED]
Message:

> If I move search.htm the script complains that it can't find 
> search.htm via the ssh. 

Well that's good. It shows that the cgi is looking in the right 
place for the right file.

 
> maybe it is any kind of help I have called the script like this 
> configure --with-mysql --prefix=/mylocal/dir/

That should be OK.


> it does not matter if I put it into a local dir or in a local 
> cgi-bin/dir it never works ... 

"A local cgi-bin/dir"? What do you mean? 

Question: are you running your own web server? Is the cgi-bin of 
your particular domain and user account *really* set to the cgi-bin 
directory that you are using? Are you sure? 

Z


Reply: <http://search.mnogo.ru/board/message.php?id=1370>

__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




UdmSearch: Webboard: indexer runs and search.cgi does not

2001-02-13 Thread Zenon Panoussis

Author: Zenon Panoussis
Email: [EMAIL PROTECTED]
Message:

Grrr. OK, try this then: Make a static HTML page with this form in it:

   [form markup stripped in the archive: a "Search for:" text box 
    and a submit button]

Try it and tell me what you get. Somehow we need to figure 
whether the error is in search.htm, file permissions, the 
web server configuration or yet something else. 

BTW, do you have a 100% standard installation? No recompilations 
after you copied search.cgi in cgi-bin, no particular modifications 
to anything? 

Z



Reply: <http://search.mnogo.ru/board/message.php?id=1369>

__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




UdmSearch: Webboard: Other segfault

2001-02-12 Thread Zenon Panoussis

Author: Zenon Panoussis
Email: [EMAIL PROTECTED]
Message:
[3.1.10, RH 7.0 on PII, mysql-3.23.29-1, cache mode]

While trying to reproduce the splitter segfault, I got a segfault 
from indexer. I don't remember this ever happening before and I've 
been using mnogosearch since the early days of 3.1.7. The way things 
have been lately I would start questioning my RAM, but everything 
else on the machine runs fine, so it can't be that.

OK, the debug:

# gdb indexer core
GNU gdb 5.0

This GDB was configured as "i386-redhat-linux"...
Core was generated by `./indexer -m -s 200'.
Program terminated with signal 11, Segmentation fault.
Reading symbols from /usr/lib/mysql/libmysqlclient.so.10...done.
Loaded symbols for /usr/lib/mysql/libmysqlclient.so.10
Reading symbols from /lib/libm.so.6...done.
Loaded symbols for /lib/libm.so.6
Reading symbols from /usr/lib/libz.so.1...done.
Loaded symbols for /usr/lib/libz.so.1
Reading symbols from /lib/libc.so.6...done.
Loaded symbols for /lib/libc.so.6
Reading symbols from /lib/libcrypt.so.1...done.
Loaded symbols for /lib/libcrypt.so.1
Reading symbols from /lib/libnsl.so.1...done.
Loaded symbols for /lib/libnsl.so.1
Reading symbols from /lib/ld-linux.so.2...done.
Loaded symbols for /lib/ld-linux.so.2
Reading symbols from /lib/libnss_files.so.2...done.
Loaded symbols for /lib/libnss_files.so.2
Reading symbols from /lib/libnss_nisplus.so.2...done.
Loaded symbols for /lib/libnss_nisplus.so.2
Reading symbols from /lib/libnss_nis.so.2...done.
Loaded symbols for /lib/libnss_nis.so.2
Reading symbols from /lib/libnss_dns.so.2...done.
Loaded symbols for /lib/libnss_dns.so.2
Reading symbols from /lib/libresolv.so.2...done.
Loaded symbols for /lib/libresolv.so.2
#0  0x805e5fa in UdmCRC32 (buf=0x4021b03e "", size=4294967295) at crc32.c:97
97  _CRC32_(crc, *p) ;
(gdb) backtrace
#0  0x805e5fa in UdmCRC32 (buf=0x4021b03e "", size=4294967295) at crc32.c:97
#1  0x804d768 in UdmIndexNextURL (Indexer=0x808d308, index_flags=5) at indexer.c:1145
#2  0x804a020 in thread_main (arg=0x0) at main.c:256
#3  0x804a9b0 in main (argc=4, argv=0xbaa4) at main.c:596
#4  0x4009bbfc in __libc_start_main (main=0x804a13c , argc=4, ubp_av=0xbaa4, 
init=0x8049684 <_init>, fini=0x8068bec <_fini>, rtld_fini=0x4000d674 <_dl_fini>, 
stack_end=0xba9c)
at ../sysdeps/generic/libc-start.c:118

This time I'm keeping core. Just tell me if you want me to run gdb on anything else 
and how to do that.

In any case I'd suggest that you don't bother with this now. Indexer has been working 
so well so far, that we probably can ascribe this to pure bad luck. If it happens 
again I'll let you know.

Z



Reply: <http://search.mnogo.ru/board/message.php?id=1350>

__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




UdmSearch: Webboard: splitter -p does not rename logs

2001-02-12 Thread Zenon Panoussis

Author: Zenon Panoussis
Email: [EMAIL PROTECTED]
Message:
v 3.1.10 in cache mode:

splitter -p should rename the .del and .wrd logs to 
.del.done and .wrd.done (cachemode.txt, 4B). It did until 
v 3.1.9, but doesn't any more. run-splitter -p does though. 

Z



Reply: <http://search.mnogo.ru/board/message.php?id=1349>

__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




UdmSearch: Webboard: indexer runs and search.cgi does not

2001-02-12 Thread Zenon Panoussis

Author: Zenon Panoussis
Email: [EMAIL PROTECTED]
Message:

> thanks for the hint. It looks like this iExplorer 5.5 believes 
> the above mentioned code. Opera 4.0 says that the page is just 
> empty. 

> the code from the ssh session looks quite fine to me. Here it 
> is. 

OK, we're getting closer. 

Your code says . This results in 
no action at all. Check your search.htm: it should say 
 . 

Z


Reply: <http://search.mnogo.ru/board/message.php?id=1347>

__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




Re: UdmSearch: Webboard: Segfault (grrr)

2001-02-12 Thread Zenon Panoussis



Alexander Barkov skrev:
> 

> Could you check count, j, w, table[w], logwords[count+j]
> variable values? Use print gdb command.

AAARGH! I deleted the core dump. I didn't know that I could do 
that :(  

Z


-- 
oracle@everywhere: The ephemeral source of the eternal truth...
__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




UdmSearch: Webboard: This is SHITE!!!

2001-02-12 Thread Zenon Panoussis

Author: Zenon Panoussis
Email: [EMAIL PROTECTED]
Message:

> After spending nearly 3 Days trying to get this thing to work, I 
> have come to the conclusion that it is a waste of time and a 
> JOKE:-(..)

In that case you are entitled to your money back. Every penny 
of it.


> The documentation is poor and the support I am getting from this 
> board is daft. 

Really? Let's see: 

  The query is OKAY except it is expecting some input. 
  Question: 
  Do the inputs need to be supplied during indexing? 

  My users supply the inputs when they are searching. 
  At indexing time, I have no idea what inputs they will 
  supply. 

What input? If the users supply input during *searching*, you 
can't very well expect that that input will be used during 
*indexing*, can you? Indexing is supposed to take place 
*before* searching. 

And also:

  Can this web board be searched? I hope it is not some 
  kinda strategy for this site to get more hits. 

(insults in the very question)

Alexander:

  There is a link from main page to the site search. 
  Webboard is indexed too among static documents and mailing 
  list archive. 

You:

  Where is it? I cannot find it. 

Your question was answered. If you can't find the main page of 
this site, you shouldn't be installing databases. If you can't 
find the search link on the main page, you shouldn't be near 
computers at all.


> Does anyone else no of any alternative? If so 
> please let me know. 

No alternatives will help you. You need to start at the basics. 
"How to find a link on a web page" etc.


> See my postings below to see problems I have been having and 
> the replies  i get and you will see why I am feeling this way.

The only reason I bother to reply to any of this is that the 
developers have put a tremendous amount of work into something 
they provide to you for free, and all you know to do is 
(a) pose incomprehensible questions, (b) pose stupid questions 
and (c) post insults. 

I wish you luck with your computercontractor.net . Perhaps one 
day you can use it to find someone with more skills and less 
arrogance than yourself.

Z


Reply: <http://search.mnogo.ru/board/message.php?id=1343>

__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




UdmSearch: Webboard: Parameters...

2001-02-12 Thread Zenon Panoussis

Author: Zenon Panoussis
Email: [EMAIL PROTECTED]
Message:

> I have built my index using this:
 
> HTDBDoc \
> SELECT concat( \
> [etc]
> FROM jobsadvertised \
> WHERE job_id='$1' and to_days(now()) - to_days(job_inp_dte) <= '$2' and site_type 
>= '$4' and job_location = '$3' and job_type = '$5'


> How do I pass the required values to indexer from my browser. 
> I.e, how will it know what $1 is etc...

This means that you have created your tables and that you are 
not using search.cgi, but are writing your own search scripts 
too. Well then you should look at the Perl DBI module and try 
something like 

***
#!/usr/bin/perl -w
use strict;
use DBI;
use CGI ':standard';
use CGI::Carp 'fatalsToBrowser';
$CGI::POST_MAX=300;

# pull the submitted form fields into the R:: namespace ($R::job, $R::table, ...)
import_names('R');

# you also need a database handle; DSN, user and password here are placeholders
my $dbh = DBI->connect("DBI:mysql:your_database", "your_user", "your_password",
                       { RaiseError => 1 });

print header();
# ... print your HTML here ...
my $sth = $dbh->prepare("SELECT \"%$R::job%\" FROM \"$R::table\" "
  . "WHERE parameter = \"$R::search_term\" "
  . "AND other_parameter LIKE \"%$R::other_term%\" GROUP BY what_you_want");
$sth->execute();
while (my $ref = $sth->fetchrow_hashref()) {
  print "$ref->{'job'}","\n";
}
# ... print more HTML here ...
$sth->finish();
$dbh->disconnect();
print end_html();

***

You call this jobs.cgi. Then you create a search form in plain 
HTML with action=/cgi-bin/jobs.cgi where you name the input 
fields. The names that you have given to those input fields 
will be passed to the script in the form R::name and go straight 
where you want them.

Note that the above example is just a very quick adaptation of 
something I had ready, so it will most probably not work as it 
is. In any case, for this kind of thing you might be screaming 
in the wrong forum. You'd probably be better off asking questions 
in the appropriate database and/or perl module mailing lists.

Z



Reply: <http://search.mnogo.ru/board/message.php?id=1341>

__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




UdmSearch: Webboard: indexer runs and search.cgi does not

2001-02-12 Thread Zenon Panoussis

Author: Zenon Panoussis
Email: [EMAIL PROTECTED]
Message:

The page is still empty. The actual code is: 
 
> > <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"> 
> >  
> >  

> So there is no error message. When I go to the search.cgi dir and call search.cgi 
>via ssh I still receive a quite good looking HTML-Code on the stdout... 

Hmm. I'm beginning to wonder: could it be that charset tag? What 
OS and browser are you using to look at the page with? 

Try this:
1. Copy/paste the html from ssh onto a static page on the same 
   server. Access that page with your browser. What do you see? 
2. Remove the charset tag from the static page. Try again. What 
   do you see? 

Z


Reply: <http://search.mnogo.ru/board/message.php?id=1338>

__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




UdmSearch: Webboard: indexer -m -e -n 1000

2001-02-12 Thread Zenon Panoussis

Author: Zenon Panoussis
Email: [EMAIL PROTECTED]
Message:

Is it easy to implement? With big databases it would allow 
forced re-indexing from the bottom up in a controlled manner. 

As it is now, if you just do indexer -m the indexer will run 
away and do the entire URL list in the database. If you do 
indexer -m -c n and then repeat it, the indexer will take the 
same documents in both runs. And if you don't know which 
documents are oldest, you can't use indexer -m -u pattern 
(which would also be very tedious if you have a huge list of 
URLs). 

Z



Reply: <http://search.mnogo.ru/board/message.php?id=1339>

__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




UdmSearch: Webboard: run-splitter

2001-02-12 Thread Zenon Panoussis

Author: Zenon Panoussis
Email: [EMAIL PROTECTED]
Message:

run-splitter does not obey --localstatedir . If you configure 
with --prefix=/usr/local/mnogo3110 --localstatedir=/var/mnogo3110 ,
run-splitter comes out as 

  PREFIX=/usr/local/mnogo3110

  VAR=$PREFIX/var
  SBIN=$PREFIX/sbin
  PID=$VAR/cachelogd.pid
  SPLITTER=$SBIN/splitter

It's only cosmetic, but should be easy to fix. 
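
With --localstatedir honoured you would expect it to come out as 
(a sketch of the intended result, not actual generated output):

  PREFIX=/usr/local/mnogo3110

  VAR=/var/mnogo3110        # the --localstatedir value, not $PREFIX/var
  SBIN=$PREFIX/sbin
  PID=$VAR/cachelogd.pid
  SPLITTER=$SBIN/splitter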

Z


Reply: <http://search.mnogo.ru/board/message.php?id=1333>

__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




Re: UdmSearch: Webboard: Segfault (grrr)

2001-02-12 Thread Zenon Panoussis



Zenon Panoussis skrev:
> 

> I'll delete the entire tree directory and start re-indexing from
> scratch. I'll make and split a small file first, ca 5 MB, then a
> 31 MB file, if that works yet another 31 MB file, and so on until
> I get in problems again. Will report back later this evening.

First step OK: 

- indexed for a while, created 2.8 MB log file
- split successfully and even got the FFF directory:
  
  /var/mnogo3110/tree/FF/F/FFFE6000 old:   0 new:   2 total:   2
  /var/mnogo3110/tree/FF/F/FFFE7000 old:   0 new:  24 total:  24

Now for 31 MB adventures :)

Z


-- 
oracle@everywhere: The ephemeral source of the eternal truth...
__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




Re: UdmSearch: single mode works, cache mode not.

2001-02-12 Thread Zenon Panoussis



Fredy Kuenzler skrev:
> 

> It seems to me, that cache mode in 3.1.9 and 3.1.10 does not work
> good. Indexer (according to the /doc) works in cache mode and
> single mode, however search.cgi does not find anything in cache
> mode. In single mode everything works as expected.

It's a whole series of things. Your sql tables must be in single 
mode (i.e. created with create.txt only) and you must have set 
cache mode in both indexer.conf and search.htm and you must have 
run cachelogd, indexer, splitter -p and splitter in the right way 
in the right order. Is all that OK?
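
For reference, "the right order" as I run it here (a sketch based on 
doc/cachemode.txt and on my own layout; adjust the paths):

  cd /usr/local/mnogo3110/sbin
  ./cachelogd &          # must be running while indexer works
  ./indexer
  ./run-splitter -k      # rotate (rename) the current word logs
  ./run-splitter -p      # prepare them into var/splitter
  ./splitter             # split them into the var/tree word database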

Z


-- 
oracle@everywhere: The ephemeral source of the eternal truth...
__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




Re: UdmSearch: Webboard: Segfault (grrr)

2001-02-12 Thread Zenon Panoussis


Zenon Panoussis skrev:
> 

>And a really HARD hang at the same place as before. So hard
>that I can't even kill splitter.

BTW, although I couldn't kill splitter, I did find a core dump 
in sbin. Here's the backtrace:


# gdb splitter core
GNU gdb 5.0

This GDB was configured as "i386-redhat-linux"...
Core was generated by `./splitter'.
Program terminated with signal 11, Segmentation fault.
Reading symbols from /usr/lib/mysql/libmysqlclient.so.10...done.
Loaded symbols for /usr/lib/mysql/libmysqlclient.so.10
Reading symbols from /lib/libm.so.6...done.
Loaded symbols for /lib/libm.so.6
Reading symbols from /usr/lib/libz.so.1...done.
Loaded symbols for /usr/lib/libz.so.1
Reading symbols from /lib/libc.so.6...done.
Loaded symbols for /lib/libc.so.6
Reading symbols from /lib/libcrypt.so.1...done.
Loaded symbols for /lib/libcrypt.so.1
Reading symbols from /lib/libnsl.so.1...done.
Loaded symbols for /lib/libnsl.so.1
Reading symbols from /lib/ld-linux.so.2...done.
Loaded symbols for /lib/ld-linux.so.2
#0  0x8057d15 in UdmSplitCacheLog (log=300) at cache.c:635
635             logwords[count+j].wrd_id=table[w].wrd_id;
(gdb) backtrace
#0  0x8057d15 in UdmSplitCacheLog (log=300) at cache.c:635
#1  0x8049f29 in main (argc=1, argv=0xbac4) at splitter.c:74
#2  0x4009bbfc in __libc_start_main (main=0x8049e20 , argc=1, ubp_av=0xbac4, 
init=0x8049630 <_init>, fini=0x8064f7c <_fini>, rtld_fini=0x4000d674 <_dl_fini>, 
stack_end=0xbabc)
at ../sysdeps/generic/libc-start.c:118

Z


-- 
oracle@everywhere: The ephemeral source of the eternal truth...
__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




Re: UdmSearch: Webboard: Segfault (grrr)

2001-02-12 Thread Zenon Panoussis



Caffeinate The World skrev:
> 

> in my tests your 3 little files wouldn't make a difference. he would
> have to run splitter -p and splitter on all the files starting from the
> first original RAW file, including all the 31 MB file. i believe in my
> case it was the original 31mb file which caused the problem.

OK, I'll try to do this systematically now and write down everything 
I do. Bear with me if you get too much information at times; it's 
hard to know what could be relevant and what not.

1. Installed v 3.1.10 with a new prefix (produced one error, see 
   the webboard on subject "warning: no newline at end of file")
2. Copied 319/var/* to 3110/var/ (oofff! took more than an hour)

   The raw directory looks like this: 
   -rw-r--r--1 root root17096 Feb  7 07:58 981529092.del.done
   -rw-r--r--1 root root 10060900 Feb  7 07:58 981529092.wrd.done
   -rw-r--r--1 root root16808 Feb  7 09:24 981534260.del.done
   -rw-r--r--1 root root 11374124 Feb  7 09:24 981534260.wrd.done
   -rw-r--r--1 root root13400 Feb  7 11:03 981540190.del.done
   -rw-r--r--1 root root 11698476 Feb  7 11:03 981540190.wrd.done
   -rw-r--r--1 root root20328 Feb  7 12:54 981546899.del.done
   -rw-r--r--1 root root  8055532 Feb  7 12:54 981546899.wrd.done
   -rw-r--r--1 root root 7312 Feb  7 14:52 981553965.del.done
   -rw-r--r--1 root root  4459360 Feb  7 14:52 981553965.wrd.done
   -rw-r--r--1 root root 9912 Feb  7 16:52 981561131.del.done
   -rw-r--r--1 root root  5254828 Feb  7 16:52 981561131.wrd.done
   -rw-r--r--1 root root14240 Feb  7 18:53 981568430.del.done
   -rw-r--r--1 root root 10220088 Feb  7 18:53 981568430.wrd.done
   -rw-r--r--1 root root  216 Feb  7 22:27 981581773.del.done
   -rw-r--r--1 root root   220988 Feb  7 22:27 981581773.wrd.done
   -rw-r--r--1 root root14088 Feb  8 22:40 981669855.del.done
   -rw-r--r--1 root root  8719924 Feb  8 22:40 981669855.wrd.done
   -rw-r--r--1 root root  136 Feb  8 23:05 981669947.del.done
   -rw-r--r--1 root root   125028 Feb  8 23:05 981669947.wrd.done
   -rw-r--r--1 root root 5288 Feb  9 01:51 981679960.del.done
   -rw-r--r--1 root root   396972 Feb  9 01:51 981679960.wrd.done
   -rw-r--r--1 root root 1192 Feb  9 03:32 981686015.del.done
   -rw-r--r--1 root root   693916 Feb  9 03:32 981686015.wrd.done
   -rw-r--r--1 root root 4008 Feb 11 21:56 981925017.del.done
   -rw-r--r--1 root root  1876884 Feb 11 21:56 981925017.wrd.done
   -rw-r--r--1 root root 4192 Feb 11 22:51 981928286.del.done
   -rw-r--r--1 root root  3349232 Feb 11 22:51 981928286.wrd.done
   -rw-r--r--1 root root 4096 Feb 11 23:45 981931533.del.done
   -rw-r--r--1 root root  1265304 Feb 11 23:45 981931533.wrd.done
   -rw-r--r--1 root root12944 Feb 12 02:56 981945565.del
   -rw-r--r--1 root root  6801160 Feb 12 02:56 981945565.wrd
   -rw-r--r--1 root root 9024 Feb 12 04:10 981993028.del
   -rw-r--r--1 root root  3751064 Feb 12 04:10 981993028.wrd
   -rw-r--r--1 root root0 Feb 12 16:50 del.log
   -rw-r--r--1 root root0 Feb 12 16:50 wrd.log

*  As you see, no 31 MB files; last time I got them they produced 
   segfaults, so I deleted them and went on, leaving the pages they 
   contained for the next re-indexing. This means that words that 
   should be in the word files according to the mysql database are 
   not there. I don't think it matters, but I cannot be sure. Actually, 
   depending on how words are indexed, this could be the cause of 
   the current segfaults. However, even so, this wouldn't change 
   the fact that the 31 MB files caused segfaults in the first place, 
   before I deleted them.

   In this context I should also mention that I am using non-ECC 
   memory. If splitting depends on the integrity of the pre-existing 
   word files, an error that has been entered by bad copying/writing 
   would affect all subsequent splitting attempts.

   This is what the database looks like:

   #du -c -h tree
   1.6Gtotal

   #./indexer -S
 Database statistics

   Status    Expired      Total
   -----------------------------------------------
        0       4240       4240  Not indexed yet
      200          0      38945  OK
      301          0         52  Moved Permanently
      302          0        312  Moved Temporarily
      304          0         65  Not Modified
      400          0          2  Bad Request
      403          0         35  Forbidden
      404          0       2133  Not found
      503          5          5  Service Unavailable
      504          1          1  Gateway Timeout

Re: UdmSearch: Webboard: Segfault (grrr)

2001-02-12 Thread Zenon Panoussis



Alexander Barkov skrev:
> 

> > http://search.freewinds.cx/logs/logs.tar.gz
 
> Not Found

I'm senile. It's fixed (the 404, not the senility ;)

Z


-- 
oracle@everywhere: The ephemeral source of the eternal truth...
__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




Re: UdmSearch: Webboard: Segfault (grrr)

2001-02-12 Thread Zenon Panoussis


Alexander Barkov skrev:
> 

> Could you please put zipped /var/mnogo319/tree/12/B/12BFD000 and
> a file /splitter/XXX.wrd with correspondent XXX.del which produce
> crash somewhere on the net?

http://search.freewinds.cx/logs/logs.tar.gz

Z


-- 
oracle@everywhere: The ephemeral source of the eternal truth...
__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




UdmSearch: Webboard: warning: no newline at end of file

2001-02-12 Thread Zenon Panoussis

Author: Zenon Panoussis
Email: [EMAIL PROTECTED]
Message:

Does it matter? 

/bin/sh ../libtool --mode=compile gcc -DHAVE_CONFIG_H -I. -I. -I../include -I../include -I/usr/include/mysql -g -O2 -DUDM_CONF_DIR=\"/usr/local/mnogo3110/etc\" -DUDM_VAR_DIR=\"/var/mnogo3110\" -c udmutils.c
gcc -DHAVE_CONFIG_H -I. -I. -I../include -I../include -I/usr/include/mysql -g -O2 -DUDM_CONF_DIR=\"/usr/local/mnogo3110/etc\" -DUDM_VAR_DIR=\"/var/mnogo3110\" -Wp,-MD,.deps/udmutils.pp -c udmutils.c -o udmutils.o
udmutils.c:1560:9: warning: no newline at end of file

Z


Reply: <http://search.mnogo.ru/board/message.php?id=1329>

__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




Re: UdmSearch: Webboard: Segfault (grrr)

2001-02-12 Thread Zenon Panoussis



Alexander Barkov skrev:
> 

> Can you guys give us a log file produced by splitter -p which caused
> crash? We can't reproduce crash :-(

Huh? splitter doesn't accept the -v5 argument, so it won't give 
more detailed logs than the normal ones. The only log I had, the 
one to stdout, is the one I included with my first posting in this 
thread: 

  Delete from cache-file /var/mnogo319/tree/12/B/12BFD000 
/var/mnogo319/tree/12/C/12C1 old: 69 new: 1 total: 70 
./run-splitter: line 118: 18790 Segmentation fault (core dumped) $SPLITTER 

Until this point everything was normal. 

Anyway, as I said, I strongly suspect corruption in the word 
database. On a previous occasion when this happened, I deleted 
the entire tree/* directory structure and started all over again. 
Splitter worked like a dream with both small and big log files 
until one of the following occurred:

1. I stopped indexer with ^C and then run splitter 
   or
2. Splitter had to work itself through some 31 MB files. (These 
   files are not all the same size; they tend to get slightly 
   bigger the later they come, i.e. something like this:
 0001.log31.500.000 bytes
 0002.log31.550.000 bytes 
 0003.log31.580.000 bytes
   sort of). 

Unfortunately I haven't been making notes, so I can't tell for 
sure which one of these two things happened before things stopped 
working. 

I tried splitter again today with ./splitter >splitter.log . It 
went in a very normal way *almost* as far as yesterday, and then 
hung so badly that not even kill -9 could kill it. The log of 
this run looks like 


Delete from cache-file /var/mnogo319/tree/12/B/12B27000
Delete from cache-file /var/mnogo319/tree/12/B/12B2D000
Delete from cache-file /var/mnogo319/tree/12/B/12B3
Delete from cache-file /var/mnogo319/tree/12/B/12B31000
Delete from cache-file /var/mnogo319/tree/12/B/12B3

I am attaching the three files that could be involved, 
namely tree/12/B/12B31000, 12B32000 and 12B35000. 


I'll install 3.1.10 now, try it on the old word database and see 
what it does. If it doesn't work, I'll remove the word database 
and start again from scratch. I'll try to make detailed notes this 
time and report back. 

Z


-- 
oracle@everywhere: The ephemeral source of the eternal truth...
 wordfiles.tar.gz


UdmSearch: Webboard: Segfault (grrr)

2001-02-11 Thread Zenon Panoussis

Author: Zenon Panoussis
Email: [EMAIL PROTECTED]
Message:
RH Linux 7.0, search 3.1.9, MySQL 3.23.29, cache mode, with the 
new patches for cache.c and sql.c. 

It happens all the time. It started happening when "maximum size" 
31 MB log files were indexed, but by now it happens on any indexing, 
no matter how big or small the log file, as if the database somehow 
was corrupt:

  Delete from cache-file /var/mnogo319/tree/12/B/12BFD000
  /var/mnogo319/tree/12/C/12C1 old:  69 new:   1 total:  70
  ./run-splitter: line 118: 18790 Segmentation fault  (core dumped) $SPLITTER

For the same log file it always crashes at the same index file 
(e.g. every time I try to reindex 12345678.log it will crash 
at tree/12/3/4567000). If I delete the log file and start again 
with a new log file, it will crash at a different place, but it 
will still be consistent in crashing at the same place every time. 

And the backtrace:

# gdb splitter core
GNU gdb 5.0
[...]
This GDB was configured as "i386-redhat-linux"...
Core was generated by `/usr/local/mnogo319/sbin/splitter'.
Program terminated with signal 11, Segmentation fault.
Reading symbols from /usr/lib/mysql/libmysqlclient.so.10...done.
Loaded symbols for /usr/lib/mysql/libmysqlclient.so.10
Reading symbols from /lib/libm.so.6...done.
Loaded symbols for /lib/libm.so.6
Reading symbols from /usr/lib/libz.so.1...done.
Loaded symbols for /usr/lib/libz.so.1
Reading symbols from /lib/libc.so.6...done.
Loaded symbols for /lib/libc.so.6
Reading symbols from /lib/libcrypt.so.1...done.
Loaded symbols for /lib/libcrypt.so.1
Reading symbols from /lib/libnsl.so.1...done.
Loaded symbols for /lib/libnsl.so.1
Reading symbols from /lib/ld-linux.so.2...done.
Loaded symbols for /lib/ld-linux.so.2
#0  0x8059061 in UdmSplitCacheLog (log=300) at cache.c:552
552             logwords[count+j].wrd_id=table[w].wrd_id;

(gdb) backtrace
#0  0x8059061 in UdmSplitCacheLog (log=300) at cache.c:552
#1  0x8049e89 in main (argc=1, argv=0xba94) at splitter.c:70
#2  0x4009bbfc in __libc_start_main (main=0x8049d80 , argc=1, ubp_av=0xba94, 
init=0x80495bc <_init>, fini=0x8065b7c <_fini>, rtld_fini=0x4000d674 <_dl_fini>, 
stack_end=0xba8c)
at ../sysdeps/generic/libc-start.c:118

Since 3.1.10 is coming out today, I'll try it and see if things 
work better. If not, I'll post more bad news later ;)

Z



Reply: <http://search.mnogo.ru/board/message.php?id=1320>

__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




UdmSearch: Webboard: A bug in search.cgi???

2001-02-11 Thread Zenon Panoussis

Author: Zenon Panoussis
Email: [EMAIL PROTECTED]
Message:
> When searching words in spanish (accentuated characters, ñ) with search.cgi I get 
>results like the following:
> 
> If I search for «España», search.cgi breaks the word in two parts, searching for 
>«Espa» and also for «a», ignoring «ñ».

> Or perhaps I'm doing something wrong...

Have you set local charset to 8859-1? If not, do so in both 
indexer.conf and search.htm .

Z


Reply: <http://search.mnogo.ru/board/message.php?id=1319>

__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




Re: UdmSearch: Webboard: Site search (ul) in cache mode

2001-02-07 Thread Zenon Panoussis



Zenon Panoussis skrev:
> 

> > We found a bug. Please find patches against sql.c and cache.c
> > in attachement.
 
> The patch didn't work by itself, so I did the replacements manually.
> The patched source compiled without complaints. I replaced the old
> search.cgi with the new one but site search still doesn't work.
> Should I re-run splitter or re-index completely?

I completely re-indexed some documents with indexer -a -g (category) 
with the patched compilation. Site search (ul) still doesn't work. 
I am using it in the form of 

and fill in http://www.domain.dom/ (domain.dom being one of the just 
re-indexed servers) before hitting "search". 

Now what? 

Z


-- 
oracle@everywhere: The ephemeral source of the eternal truth...
__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




Re: UdmSearch: Webboard: Site search (ul) in cache mode

2001-02-07 Thread Zenon Panoussis


Alexander Barkov skrev:
> 

> We found a bug. Please find patches against sql.c and cache.c
> in attachement.

The patch didn't work by itself, so I did the replacements manually. 
The patched source compiled without complaints. I replaced the old 
search.cgi with the new one but site search still doesn't work. 
Should I re-run splitter or re-index completely? 

Z


-- 
oracle@everywhere: The ephemeral source of the eternal truth...
__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




Re: UdmSearch: Webboard: Site search (ul) in cache mode

2001-02-07 Thread Zenon Panoussis


Alexander Barkov skrev:
> 

> > > > Now the tags and categories work fine, but not the site search.
> > > > The ul= directive is completely ignored by search.cgi.

> > > What was the value of ul=  variable you tryed?

> > I tried all of the following:
> > - http://www.domain.dom
> > - www.domain.dom
> > - domain
> > - /path/
> > - path

> > Nothing works. Actually, right now there is garbage text in the
> > ul variable and the search doesn't care about it. You can see
> > it at http://search.freewinds.cx -> hit "New search".
 
>   Check  http://www.domain.dom/  please with trailing slash.

That doesn't work either. 

Z


-- 
oracle@everywhere: The ephemeral source of the eternal truth...
__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




Re: UdmSearch: Webboard: Splitter: core dumped

2001-02-07 Thread Zenon Panoussis


Caffeinate The World skrev:
> 


> i'll wait. for now, i'm indexing but running splitter when the files
> are around 2MB.

I've been running indexer -c 3600 since last night, producing 
log files of 5-10 MB and running splitter every time afterwards, 
with cleaning of var/splitter and all. So far no problems at all. 
I have a hunch that the problem is with splitting multiple big 
files in one go. 
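
Roughly, the routine above as a loop (a sketch only; the paths are my 
layout and the cleanup of var/splitter is simplified):

  cd /usr/local/mnogo3110/sbin
  while true; do
      ./indexer -c 3600                  # index for an hour, then stop
      ./run-splitter -k                  # rotate the word logs
      ./run-splitter -p                  # prepare them into var/splitter
      ./splitter                         # merge them into the var/tree database
      rm -f /var/mnogo3110/splitter/*    # clean var/splitter for the next round
  done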

A friend offered to lend me some memory. If I can get my ass 
over there and fetch it, I'll try a huge splitting first with 
my standard 128 MB RAM and then with 1 GB RAM. If there is any 
difference in the behaviour of splitter, it will be a good 
indication of where to look for the problem.

Z


-- 
oracle@everywhere: The ephemeral source of the eternal truth...
__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




Re: UdmSearch: Webboard: Splitter: core dumped

2001-02-07 Thread Zenon Panoussis


Caffeinate The World skrev:
> 

> > I run splitter -p and finish fine. I then run splitter and,
> > halfway through the splitting, crash: segmentation fault, or
> > just a hang, core dumped. So I restart splitter and next time
> > finish fine.

> what machine are you on? Alpha? OS?

Intel PII, RH Linux 7.0 with a 2.2 kernel.

 
> i had the same problem and i sent a message to the mailing list
> describing how i corrected it.  search for "core" and "splitter"

Found it. My dump appeared at a different position than yours, 
at 076, but was just as persistent as yours. Also, the premises 
are similar: I had run indexer for a long time and I had five 
31 MB files waiting to be split. Splitter choked every time on 
the third one of them. This has never happened before or after 
when the logs have been smaller than 31 MB, so I'm just re-running 
smaller chunks at a time.


> can you check another thing? i've never seen my splitter split the
> lasta file "FFF.log". do you get that file? it goes as high as FFE.log
> only.

Indeed, last night I saw it stop at FFE.log . But I have had files 
at tree/FF/F/... , so I assume that other times it went all the 
way to FFF. 

Z


-- 
oracle@everywhere: The ephemeral source of the eternal truth...
__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




UdmSearch: Webboard: Splitter: core dumped

2001-02-06 Thread Zenon Panoussis

Author: Zenon Panoussis
Email: [EMAIL PROTECTED]
Message:

[3.1.9, cache mode]

I run splitter -p and it finishes fine. I then run splitter and, 
halfway through the splitting, it crashes: segmentation fault, or 
just hangs, core dumped. So I restart splitter and the next time 
it finishes fine. 

The question is: what can this do to the word database? Will 
it still be accurate, or will some words be inserted twice? 
Can I just re-run and finish and be happy, or should I re-index? 

Z


Reply: <http://search.mnogo.ru/board/message.php?id=1271>

__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




Re: UdmSearch: Webboard: Search results display problem

2001-02-06 Thread Zenon Panoussis



Matthew Sullivan skrev:
> 

> > I have the same problem (see http://search.freewinds.cx ) and
> > I thought it was my own HTML that did it. If you find the cause,
> > will you please post it on the webboard?
 
> Yours looks ok to me.

That's only because I took your advice in the meanwhile.  All 
my tables were width="95%" and were contained in one big table 
of width="100%", except one, which was width="100%" 
itself. I changed it to 95% and the problem seems to be gone. 
Feel credited :)

BTW, you should post your reply to the webboard. I suspect that 
lots of people read it who are not on the list. Besides, it's 
searchable. 

Z


-- 
oracle@everywhere: The ephemeral source of the eternal truth...
__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




UdmSearch: Webboard: Search results display problem

2001-02-06 Thread Zenon Panoussis

Author: Zenon Panoussis
Email: [EMAIL PROTECTED]
Message:

> I have a problem with the display of search results in netscape. 
> When ever I have a large number of results, the width of the 
> search results are wider than the td(table data) and extend 
> beyond it...

I have the same problem (see http://search.freewinds.cx ) and 
I thought it was my own HTML that did it. If you find the cause, 
will you please post it on the webboard?

Z


Reply: <http://search.mnogo.ru/board/message.php?id=1270>

__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




UdmSearch: Webboard: Webboard <-> mailing list interaction

2001-02-05 Thread Zenon Panoussis

Author: Zenon Panoussis
Email: [EMAIL PROTECTED]
Message:

What is posted on the webboard goes to the mailing list too, but 
what is posted on the mailing list doesn't go to the webboard. 
Wouldn't it be a good idea if it did? 

Z


Reply: <http://search.mnogo.ru/board/message.php?id=1250>

__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




Re: UdmSearch: Webboard: Site search (ul) in cache mode

2001-02-05 Thread Zenon Panoussis



Alexander Barkov skrev:
> 

> > Now the tags and categories work fine, but not the site search.
> > The ul= directive is completely ignored by search.cgi.
 
> What was the value of ul=  variable you tryed?

I tried all of the following:
- http://www.domain.dom
- www.domain.dom 
- domain
- /path/
- path

Nothing works. Actually, right now there is garbage text in the 
ul variable and the search doesn't care about it. You can see 
it at http://search.freewinds.cx -> hit "New search".

BTW, if you go to the site in half an hour or so, "New search" 
will have been moved to "Search"; I'm just in the process of 
replacing the MySQL search with the cache mode one.

Z

-- 
oracle@everywhere: The ephemeral source of the eternal truth...
__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




UdmSearch: Webboard: Splitter

2001-02-05 Thread Zenon Panoussis

Author: Zenon Panoussis
Email: [EMAIL PROTECTED]
Message:

> Hi, can anybody point me to some documentation about the 
> splitter: what it is, what it does, ...? splitter -h doesn't 
> really help me further, and I haven't found an answer in the 
> mailing list or the included documentation. Thanks in advance. 

Check doc/cachemode.txt. splitter -h itself is not documented; 
the option probably does not exist. run-splitter -k renames the 
current word logs and starts new ones. run-splitter -p (or just 
splitter -p) divides the renamed word logs into 4096 files in 
basedir/var/splitter. run-splitter -s (or just splitter) divides 
those words in turn into 1,000,000 files in basedir/var/tree. 
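A minimal sketch of that sequence, assuming run-splitter was 
installed next to indexer under /usr/local/mnogosearch/sbin 
(adjust the paths to your own installation):

==
# rotate: rename the current word logs and start new ones
/usr/local/mnogosearch/sbin/run-splitter -k
# first pass: split the rotated logs into basedir/var/splitter
/usr/local/mnogosearch/sbin/run-splitter -p
# second pass: split those into the basedir/var/tree hierarchy
/usr/local/mnogosearch/sbin/run-splitter -s
==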

Z

Reply: <http://search.mnogo.ru/board/message.php?id=1245>

__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




UdmSearch: Webboard: Search in title: weird behaviour

2001-02-04 Thread Zenon Panoussis

Author: Zenon Panoussis
Email: [EMAIL PROTECTED]
Message:

Go to http://search.freewinds.cx , hit "New search", search for 
"zenon" (without the quotes) and limit the search to title only. 
You will get 6 results where the search term is in the title, and 
one (http://www.users.wineasy.se/noname/zenon/index.htm) where it 
is not. Is the indexer confusing title with URL? 

Z

PS. Since indexing is still going on, there might be more results 
by the time you try this. 

Reply: <http://search.mnogo.ru/board/message.php?id=1237>

__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




UdmSearch: Webboard: search.cgi does not work

2001-02-04 Thread Zenon Panoussis

Author: Zenon Panoussis
Email: [EMAIL PROTECTED]
Message:

> I have the following problem. If I try search.cgi I 
> get an error message / an empty page! If I run search.cgi 
> from a telnet session I get valid HTML output, but 
> all the vars from search.htm are empty (I mean you cannot 
> see anything; $A is replaced with nothing)

The empty page you get when you try from the web, does it 
say "An error occured"? 


> Has anyone any idea what to try next?

First of all, did indexing go OK? Did your database grow 
as it should? If yes, check this: 

- Have you renamed search.htm-dist to search.htm? 
- Have you put the right DBAddr and user:password, DBMode 
  etc in search.htm? Do all settings in search.htm match 
  the equivalent settings in indexer.conf? 
- Have you set permissions correctly for search.htm? 
- Are you using search results cache? Did you try without? 
- Are you tracking queries? Did you try not to?
- Are you using ispell? Did you try without? 

Z


Reply: <http://search.mnogo.ru/board/message.php?id=1236>

__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




Re: UdmSearch: Webboard: Site search (ul) in cache mode

2001-02-04 Thread Zenon Panoussis


> Have you solved the problem with tags and categories not working?
> How exactly, if yes?

Oh - I posted it on the webboard: compile with --enable-fasttag 
--enable-fastcat and --enable-fastsite instead of --enable-fast-tag 
--enable-fast-cat and --enable-fast-site. 

Now the tags and categories work fine, but not the site search. 
The ul= directive is completely ignored by search.cgi. 

Z


-- 
oracle@everywhere: The ephemeral source of the eternal truth...
__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




Re: UdmSearch: Bug report

2001-02-04 Thread Zenon Panoussis



Luis Bravo skrev:
> 
 
> My files are in Spanish. We have words like oración, apéndice, 
> estómago, etc. When they are indexed, indexer splits those words. 
> In the database they end up as two words: oraci n, ap ndice, est mago. 
> What can I do?

In later versions you need to set 
  LocalCharset iso-8859-1 
both in indexer.conf and in search.htm. If you don't, US-ASCII 
is assumed and all accented characters are discarded. I don't know 
whether the LocalCharset directive already existed in 3.1.2. 

Z


-- 
oracle@everywhere: The ephemeral source of the eternal truth...
__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




UdmSearch: Webboard: Site search (ul) in cache mode

2001-02-03 Thread Zenon Panoussis

Author: Zenon Panoussis
Email: [EMAIL PROTECTED]
Message:

Solve one problem and get on to the next :(

Search 3.1.9 compiled with --enable-fastsite, mysql, cache mode. 
The site search doesn't work. I tried both a hidden input that 
sets ul to "http://a_site/" and a select list offering "All sites", 
"Site 1" and "Site 2", to no avail. Either way, search.cgi returns 
results from all indexed sites. 

And yes, I double-checked. ./configure said
  checking for fast site search support... enabled
search.cgi and splitter come from this compilation and 
everything has been re-indexed with it, so it should work.

Any ideas?

Z



Reply: <http://search.mnogo.ru/board/message.php?id=1232>

__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




UdmSearch: Webboard: Tags and categories in cache mode

2001-02-03 Thread Zenon Panoussis

Author: Zenon Panoussis
Email: [EMAIL PROTECTED]
Message:
> More problems: neither tags nor categories seem to work.
> 
> I'm using v 3.1.9 with MySQL in cache mode, compiled with 
> --enable-fast-tag/cat/site ...

Found the problem: 
./configure --enable-fast-tag --enable-fast-cat --enable-fast-site 
returns 

...
checking for fast tag search support... disabled
checking for fast category search support... disabled
checking for fast site search support... disabled
...

./configure --enable-fasttag --enable-fastcat --enable-fastsite 
works much better :)

Z


Reply: <http://search.mnogo.ru/board/message.php?id=1231>

__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




UdmSearch: Webboard: Tags and categories in cache mode

2001-02-02 Thread Zenon Panoussis

Author: Zenon Panoussis
Email: [EMAIL PROTECTED]
Message:
More problems: neither tags nor categories seem to work.

I'm using v 3.1.9 with MySQL in cache mode, compiled with 
--enable-fast-tag/cat/site . I've read the part on fast 
search with tag etc limits in cachemode.txt, but I doubt 
I understood it properly. My indexer config looked like 
this:

Tag A
(lots of Server statements)
Tag B
(lots of Server statements)

This didn't work. Since cachemode.txt talks about a 10-digit 
HEX string, I replaced the above with "Tag 10" and 
"Tag 20" and with "Category 10" and 
"Category 20" respectively, deleting all indexes 
and re-indexing from scratch every time. Neither alternative 
worked. 

My search.htm contains a select list with the options "All sites", 
"One", "Another", "A third" and "A fourth (n/a)". 

No matter what you choose, you get results from all sites. 

Any ideas? Is it me or is it the search engine?

Z


Reply: <http://search.mnogo.ru/board/message.php?id=1230>

__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




UdmSearch: Webboard: Cache mode questions

2001-02-02 Thread Zenon Panoussis

Author: Zenon Panoussis
Email: [EMAIL PROTECTED]
Message:
>  
> > The search works very nicely, but it returns a tremendous 
> > amount of quoted document data...

> This is because of --enable-news-extensions

Is there *any* way to limit the quotes to just a few lines? 
If not, is there any chance this can be fixed in 3.1.10?  

Z


Reply: <http://search.mnogo.ru/board/message.php?id=1229>

__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




UdmSearch: Webboard: Is this normal?

2001-02-02 Thread Zenon Panoussis

Author: Zenon Panoussis
Email: [EMAIL PROTECTED]
Message:

> > Isn't 55 MB a bit much just for storing 15.000 URLs?  

> That's a very big size.  Do you use --enable-news-extensions?

Yes. But I haven't indexed any news yet. By now I have 
21.500 URLs and an index of 158 MB, all from ordinary 
webpages.

Z


Reply: <http://search.mnogo.ru/board/message.php?id=1226>

__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




UdmSearch: Webboard: Cache mode questions

2001-02-02 Thread Zenon Panoussis

Author: Zenon Panoussis
Email: [EMAIL PROTECTED]
Message:

> > The search works very nicely, but it returns a tremendous 
> > amount of quoted document data...
 
> Can I take a look on your search page?

Yes. Go to http://search.freewinds.cx and use "New search". 
Search for the word "something" and format "Long" and you'll 
get a results page that's almost half a megabyte. 

BTW, there is some other strange behaviour there. Searching 
for beginning of word or substring doesn't work at all. Ispell 
is not enabled, but as I understand it doesn't need to be either. 

Z


Reply: <http://search.mnogo.ru/board/message.php?id=1225>

__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




UdmSearch: Webboard: Is this normal?

2001-02-01 Thread Zenon Panoussis

Author: Zenon Panoussis
Email: [EMAIL PROTECTED]
Message:

Isn't 55 MB a bit much just for storing 15.000 URLs?  After 
all, it's only URLs stored, isn't it? Or am I wrong? Is 
anything else stored in url.myd?

The statistics:

  Database statistics

  Status    Expired      Total
  ------------------------------
       0       9147       9373  Not indexed yet
     200          0       6009  OK
     301          0         14  Moved Permanently
     302          0         38  Moved Temporarily
     400          0          1  Bad Request
     403          0         14  Forbidden
     404          0         97  Not found
     503          0          4  Service Unavailable
  ------------------------------
   Total       9147      15550

And the MySQL database:

[root@goat /root]# ls -l /var/lib/mysql/criticscache/
total 56308
-rw-rw----   1 mysql    mysql           0 Feb  1 16:55 dict.MYD
-rw-rw----   1 mysql    mysql        1024 Feb  1 16:55 dict.MYI
-rw-rw----   1 mysql    mysql        8608 Feb  1 16:55 dict.frm
-rw-rw----   1 mysql    mysql           0 Feb  1 16:55 robots.MYD
-rw-rw----   1 mysql    mysql        1024 Feb  1 16:55 robots.MYI
-rw-rw----   1 mysql    mysql        8586 Feb  1 16:55 robots.frm
-rw-rw----   1 mysql    mysql           0 Feb  1 16:55 stopword.MYD
-rw-rw----   1 mysql    mysql        1024 Feb  1 16:55 stopword.MYI
-rw-rw----   1 mysql    mysql        8578 Feb  1 16:55 stopword.frm
-rw-rw----   1 mysql    mysql           0 Feb  1 16:55 thread.MYD
-rw-rw----   1 mysql    mysql        1024 Feb  1 16:55 thread.MYI
-rw-rw----   1 mysql    mysql        8584 Feb  1 16:55 thread.frm
-rw-rw----   1 mysql    mysql    56568860 Feb  2 05:20 url.MYD
-rw-rw----   1 mysql    mysql      944128 Feb  2 05:20 url.MYI
-rw-rw----   1 mysql    mysql        9358 Feb  1 16:55 url.frm

v 3.1.9, DBMode cache with MySQL 3.23.29. 

Z



Reply: <http://search.mnogo.ru/board/message.php?id=1218>

__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




UdmSearch: Webboard: Cache mode questions

2001-02-01 Thread Zenon Panoussis

Author: Zenon Panoussis
Email: [EMAIL PROTECTED]
Message:
 
>Shouldn't the files in /var/raw also be deleted? Or are they 
>needed in any way? 

/Me stupid. The answer is in cachemode.txt: "All processed logs 
in /var/raw directory are renamed to *.done ... you can remove 
them or keep them for backup purposes". Please forget I asked. 


> 2. The search works very nicely, but it returns a tremendous 
>amount of quoted document data...

Re-reading the documentation, I haven't found the answer to this 
one. If you know, pray, tell. 

Z



Reply: <http://search.mnogo.ru/board/message.php?id=1217>

__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




UdmSearch: Webboard: Cache mode questions

2001-02-01 Thread Zenon Panoussis

Author: Zenon Panoussis
Email: [EMAIL PROTECTED]
Message:

As more things work, more questions arise. v 3.1.9 in DBMode cache, 
compiled with news-extension and using MySQL with create.txt from 
the news-3.1.tar.gz module. 


1. cachemode.txt says that after running splitter, "it is better
   to delete (or backup) files in /var/splitter directory". 
   Shouldn't the files in /var/raw also be deleted? Or are they 
   needed in any way? 
2. The search works very nicely, but it returns a tremendous 
   amount of quoted document data with each hit; often the entire 
   document. You can see this if you search for "picket" on 
   http://search.freewinds.cx/cgi-bin/search2.cgi . If you do 
   this, you will get a results page of about 0.5 MB. How can 
   the quoted text be limited to, say, four lines? 
3. I am using tags to separate different types of sites. Is it 
   possible to use a tag for news? That is, use tag A for some 
   websites, tag B for other websites and tag C for news? How 
   can this be done? 

BTW, I found out by coincidence that if search.cgi and search.htm 
are renamed the same way, e.g. search-X.cgi and search-X.htm 
respectively, it is possible to run separate searches on the same 
or on different databases from the same cgi-bin directory. This 
can be very useful and should be documented.
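Something along these lines, assuming the usual cgi-bin and etc 
locations (an untested sketch; adjust paths and names to your setup):

==
# a second front end, e.g. for a news database
cp /var/www/cgi-bin/search.cgi /var/www/cgi-bin/search-news.cgi
cp /usr/local/mnogosearch/etc/search.htm /usr/local/mnogosearch/etc/search-news.htm
# then edit DBAddr, DBMode etc. in search-news.htm to point at the other database
==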

Finally, thank you for an excellent job done. 

Z


Reply: <http://search.mnogo.ru/board/message.php?id=1214>

__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




UdmSearch: Webboard: New behaviour of indexer.conf

2001-02-01 Thread Zenon Panoussis

Author: Zenon Panoussis
Email: [EMAIL PROTECTED]
Message:
The format 
   Server path http://bladiblah/  #comment1 comment2 
in indexer.conf used to be OK until 3.1.8. However, in 3.1.9 
indexer skips the first comment, reads the second one and exits 
with the error 
   too many arguments: 'comment2'
(and BTW, "argument" is misspelled ;)
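Until that is fixed, moving the comment onto its own line should 
keep 3.1.9 happy (a guess, not a confirmed workaround):

==
# comment1 comment2
Server path http://bladiblah/
==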

Z


Reply: <http://search.mnogo.ru/board/message.php?id=1215>

__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




UdmSearch: Webboard: Indexing /usr/doc ?

2001-01-02 Thread Zenon Panoussis

Author: Zenon Panoussis
Email: [EMAIL PROTECTED]
Message:
 
> I'm trying this:
> 
> Server file://usr/doc/
> 
> But it doesn't work. Is UdmSearch able to recursively index
> directories? The documentation doesn't say anything about this. 
> I'm using version 3.0.23.

That will never work by itself. The documentation of v 3.1.8 says 
this on the subject:

#Alias <master URL> <mirror URL>
# You can use this command for example to organize search through 
# a master site by indexing a mirror site. It is also useful to
# index your site from the local file system.
# UdmSearch will display URLs from <master URL> while searching
# but go to the <mirror URL> while indexing.
# This command has global indexer.conf file effect. 
# You may use several aliases in one indexer.conf.
#Alias http://www.mysql.com/ http://mysql.udm.net/
#Alias http://www.site.com/  file:/usr/local/apache/htdocs/

Thus, what you need to do is 
  Alias http://URL_that_you_want_in_the_results file:/usr/doc/
  Server path http://URL_that_you_want_in_the_results
and run the indexer. 

Now, if you would like the results to point to files as well 
instead of to a proper URL you might try 
  Alias file:///usr/doc/ file:/usr/doc/
  Server path file:///usr/doc/ 
but I have no clue whether it will work or not. 

In any case you might want to check the change log first to see 
if the Alias directive works with your version.

Z


Reply: <http://search.mnogo.ru/board/message.php?id=971>

__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




UdmSearch: Webboard: Restricted Search - how da hell does it worx!?!?!?

2001-01-02 Thread Zenon Panoussis

Author: Zenon Panoussis
Email: [EMAIL PROTECTED]
Message:

> I can't find a way to make the Restricted Search work! I can't find in
> the DB any data that says that a specific URL is related to a
> restricted criterion (like Sports or Shopping, which are given as an
> example in the search.php of the latest MnogoSearch!). How does it
> work? And how can I use it in my queries?

If you only have a few categories, the easiest way to do this is 
to use tags. Put in indexer.conf the following:

==
Tag A
Server site http://www.domain.fr/
Server site http://www.otherdomain.it/
Server path http://www.3rddomain.de/shop/

Tag B
Server site http://www.domain.com/
Server site http://www.otherdomain.com/
Server path http://www.3rddomain.com/shop/
==

Then put the following in search.htm:
==

All sites
European sites
US sites
Reserved

==
and you're ready. 
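The archive has stripped the markup from that block; a guess at 
what it looked like, assuming the stock search.htm passes the tag 
limit in the t form variable:

==
<SELECT NAME="t">
<OPTION VALUE="">All sites</OPTION>
<OPTION VALUE="A">European sites</OPTION>
<OPTION VALUE="B">US sites</OPTION>
<OPTION VALUE="C">Reserved</OPTION>
</SELECT>
==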

If you have a more complex system of classification you might 
want to use categories instead. They work basically the same 
way and they are properly described in the documentation. 

Z



Reply: <http://search.mnogo.ru/board/message.php?id=970>

__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




UdmSearch: Webboard: Cosmetic correction

2001-01-02 Thread Zenon Panoussis

Author: Zenon Panoussis
Email: [EMAIL PROTECTED]
Message:

Minor, trivial stuff: indexer -S returns the caption "UdmSearch statistics". Since 
that can end up public, as for instance in http://search.freewinds.cx/cgi-bin/stats, 
you might want to change it. 

Z


Reply: <http://search.mnogo.ru/board/message.php?id=961>

__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




UdmSearch: Webboard: WORDCHAR and CONTRACTIONCHARS

2001-01-01 Thread Zenon Panoussis

Author: Zenon Panoussis
Email: [EMAIL PROTECTED]
Message:

There was a discussion about word separators back in January; see 
http://www.mail-archive.com/udmsearch%40web.izhcom.ru/msg00200.html .
Since I just realised that I am facing the same problem, I wonder 
if Charlie's idea was implemented in newer versions. 

If not, will it be? 

Z


Reply: <http://search.mnogo.ru/board/message.php?id=959>

__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




UdmSearch: Webboard: How can users add their homepage to the index?

2001-01-01 Thread Zenon Panoussis

Author: Zenon Panoussis
Email: [EMAIL PROTECTED]
Message:
> Since 3.1.7 we have new "ServerTable" feature.
> Currently we have no front-end to add URLs into
> server tables, but we have a plan to make it soon.
> 


I use the following form and CGI for this purpose. You will, of 
course, have to adjust them for your needs. See further comments 
at the bottom.

The form:
=

Submit a URL for indexing.

Choose how to limit the indexing. Read the &$%@#!! 
instructions first.

Limit to:
  Entire domain:
  Directory and below:
  Single page only:

Choose a category:
  Critics:
  Free Zone:
  Media:

Done? Is everything correct? Then 
=
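The archive has stripped the HTML markup out of that form, so only 
the labels survive above. A rough, hypothetical reconstruction, 
using the field names the script below expects (url, limit, 
category); the action URL and the field values are my guesses:

=
<FORM METHOD="POST" ACTION="/cgi-bin/submit.cgi">
Submit a URL for indexing.
<INPUT TYPE="text" NAME="url" SIZE="60"><BR>
Limit to:<BR>
Entire domain:       <INPUT TYPE="radio" NAME="limit" VALUE="site"><BR>
Directory and below: <INPUT TYPE="radio" NAME="limit" VALUE="path"><BR>
Single page only:    <INPUT TYPE="radio" NAME="limit" VALUE="page"><BR>
Choose a category:<BR>
Critics:   <INPUT TYPE="radio" NAME="category" VALUE="critics"><BR>
Free Zone: <INPUT TYPE="radio" NAME="category" VALUE="free zone"><BR>
Media:     <INPUT TYPE="radio" NAME="category" VALUE="media"><BR>
Done? Is everything correct? Then <INPUT TYPE="submit" VALUE="Submit">
</FORM>
=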

And the script:
=
#!/usr/bin/perl -w 

use strict;
use CGI ':standard';
use CGI::Carp 'fatalsToBrowser';
$CGI::POST_MAX = 1024;            # refuse oversized POST requests

import_names('R');                # make the form fields available as $R::name
my $h = remote_host();

my $url = $R::url;
my $limit = $R::limit;
my $category = $R::category;
my $d = localtime;

# strip everything but a conservative set of URL characters
$url =~ tr/ a-zA-Z0-9!.,:_\-#\$%&+\[\]=\?\/\~//cd;
$limit =~ tr/a-z//cd;
$category =~ tr/ a-z//cd;

# append the submission as a commented-out Server line for manual review
open(FILE, ">>/usr/local/mnogosearch/etc/indexer.conf"); 
print FILE "#Server $limit $url in $category from $h on $d\n";  
close(FILE);

print header();
print start_html(-title => 'URL submission completed');
print "\n";
print "You submitted $url for indexing as a $limit in the category $category.\n";
print "The page(s) will be examined and added to the index within a few days.";
print "";
print "\n\n";
print end_html();
=

As you probably can see, this is a bit primitive; the tag is 
added at the end of the "Server" directive and leaves me to 
manually move the submitted URLs to the right place. For me 
this works fine because I don't trust submissions, but need 
to check them manually anyway. If you trust submissions (and 
trust that nobody will spam your index by filling it with 
irrelevant shit), you can add an "if/then" statement to put 
the appropriate tag or category before the URL. If you don't 
use tags or categories, the script will work fine as it is; 
just remove the "in $category from $h on $d\n" part from the 
"print FILE" statement.

Also note that I am not using the -T option. For your own 
safety you should. Also, you might want to check and possibly 
restrict the funny characters that you allow in URLs (the 
"$url =~ tr/ a-zA-Z0-9!.,:_\-#\$%&+\[\]=\?\/\~//cd;" statement). 
Mine is perhaps a bit too generous. 

You can see both this and a few other scripts in action at 
http://search.freewinds.cx .

Z



Reply: <http://search.mnogo.ru/board/message.php?id=957>

__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




UdmSearch: Webboard: Oops!

2000-12-27 Thread Zenon Panoussis

Author: Zenon Panoussis
Email: [EMAIL PROTECTED]
Message:

Beginning today or yesterday, the search at http://search.mnogo.ru/search/search.php3 
returns "Fatal error: Cannot redeclare crc32() in 
/usr/apache/search.mnogo.ru/share/htdocs/search/crc32.inc on line 11". Looks bad for 
the new front end ;)

Z


Reply: <http://search.mnogo.ru/board/message.php?id=939>

__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




UdmSearch: Webboard: Search.cgi displaying no result

2000-12-10 Thread Zenon Panoussis

Author: Zenon Panoussis
Email: [EMAIL PROTECTED]
Message:

> Search.cgi is not displaying any search result, only returning 
> the search form.
> DBMode and DBAddr are the same as in indexer.

Does it say "Sorry, an error occured"? 

If it does, try commenting out Cache, TrackQuery and Ispell 
in search.htm to see if any of them is causing the problem. 

Z


Reply: <http://search.mnogo.ru/board/message.php?id=903>

__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




UdmSearch: Webboard: Mirroring

2000-12-10 Thread Zenon Panoussis

Author: Zenon Panoussis
Email: [EMAIL PROTECTED]
Message:

If a page that had previously been indexed has been removed from the web, indexer will 
remove it from the database when re-run. If mirroring is on, will indexer also remove 
the page copy from the mirror?

Z



Reply: <http://search.mnogo.ru/board/message.php?id=902>

__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




UdmSearch: Webboard: Pedantic

2000-12-09 Thread Zenon Panoussis

Author: Zenon Panoussis
Email: [EMAIL PROTECTED]
Message:

Cosmetic bug without any consequence: 
If the download timeout limit is set to 8, the indexer will try 9 URLs on a server 
before it skips it, not only 8. Therefore I assume that one part of the programme is 
counting from 0 up and another one from 1 up. I suggest you don't bother fixing it 
unless you happen to be looking at that code anyway.

Z


Reply: <http://search.mnogo.ru/board/message.php?id=899>

__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




UdmSearch: Webboard: PHP front end and categories

2000-11-28 Thread Zenon Panoussis

Author: Zenon Panoussis
Email: [EMAIL PROTECTED]
Message:
search.cgi has been working fine here, but experimenting with the PHP front end I ran 
into problems:

 Query error: SELECT path,link,name FROM categories WHERE path LIKE
 '__' ORDER BY NAME ASC 
 Table 'db.categories' doesn't exist

I don't use categories. I don't want to use categories. How do I get rid of this? (But 
I do use tags. How do I put in tags instead?)

Z


Reply: <http://search.mnogo.ru/board/message.php?id=817>

__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




UdmSearch: Webboard: "An error occured" in search.cgi

2000-11-26 Thread Zenon Panoussis

Author: Zenon Panoussis
Email: [EMAIL PROTECTED]
Message:
 
> I've copied the search.cgi to my cgi directory and edited the 
> search.htm. When I type a search word in the search form, it thinks 
> for a little while, then returns with a little red text saying 
> "An error occured!" ... obviously in the area where the 
> search result was supposed to be printed.
 
See to it that the settings in search.htm really correspond to those 
in indexer.conf. Specifically, the DBAddr and DBMode lines must be identical in both 
files. Also, try commenting out TrackQuery and Cache and see if either one is causing 
the problem.
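For example, if indexer.conf says something like the following, search.htm must carry 
exactly the same lines (user, password, database name and DBMode here are only 
placeholders; use whatever your indexer.conf actually contains):

==
DBAddr mysql://user:password@localhost/search/
DBMode cache
==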


Reply: <http://search.mnogo.ru/board/message.php?id=809>

__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




UdmSearch: Webboard: Cache search

2000-11-26 Thread Zenon Panoussis

Author: Zenon Panoussis
Email: [EMAIL PROTECTED]
Message:

What hardware is http://udm.aspseek.com/cgi-bin/search.cgi running on? 

Reply: <http://search.mnogo.ru/board/message.php?id=808>

__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




UdmSearch: Webboard: Installation Help

2000-11-23 Thread Zenon Panoussis

Author: Zenon Panoussis
Email: [EMAIL PROTECTED]
Message:
 
> Can't open template file '/usr/local/udmsearch/etc/search.htm'! 
> 
> There is a search.htm-dist in that directory which I tried to rename, but because of 
> permissions I could not. Any help would be appreciated.

:) I did the same thing myself: forgot to copy search.htm-dist to search.htm. Do so, 
edit the new file for database name, user and password, table type and cache or not, 
and your problem will be solved. 

Z


Reply: <http://search.mnogo.ru/board/message.php?id=789>

__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




UdmSearch: Webboard: No 'Server' command for url... deleted.

2000-11-21 Thread Zenon Panoussis

Author: Zenon Panoussis
Email: [EMAIL PROTECTED]
Message:
> What does "No 'Server' command for url... deleted."
> mean when I run indexer?

It means 
(a) that you have set DeleteNoServer to "yes" in indexer.conf, 
(b) that you at some point had a Server path line in indexer.conf for the site 
that is being deleted, and that the site had been indexed, and 
(c) that you then removed that Server path statement, so on your next indexing run all the 
indexed pages of that site were deleted from the database. 

Z


Reply: <http://search.mnogo.ru/board/message.php?id=780>

__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




UdmSearch: Limitations?

2000-11-21 Thread Zenon Panoussis


Are there any inherent limitations on how long the Server path 
list can get? Would the indexer work with, say, a 2 MB list of 
URLs to index, or would it choke? 

Z
__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




UdmSearch: Webboard: Mysql query blues

2000-11-17 Thread Zenon Panoussis

Author: Zenon Panoussis
Email: [EMAIL PROTECTED]
Message:

THANK YOU! 

Z


Reply: <http://search.mnogo.ru/board/message.php?id=770>

__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




UdmSearch: Webboard: Mysql query blues

2000-11-17 Thread Zenon Panoussis

Author: Zenon Panoussis
Email: [EMAIL PROTECTED]
Message:

This is a stupid question. Please bear with a total newbie to mysql.

mysql>SELECT url FROM url WHERE status="404";
works fine and returns all the 404s. However, 
mysql>SELECT url FROM url WHERE status="404" AND url="%domain%";
returns "empty set" despite the fact that there are 404s in the domain in question. 
More weirdly, even 
mysql>SELECT url FROM url WHERE url="*";
returns an empty set. 

What am I doing wrong? 
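For what it's worth, MySQL only honours the % wildcard with LIKE, never with =, 
so a query along these lines should return the expected rows (a sketch, untested 
against this particular database):

==
mysql> SELECT url FROM url WHERE status="404" AND url LIKE "%domain%";
==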

Z


Reply: <http://search.mnogo.ru/board/message.php?id=768>

__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




UdmSearch: Webboard: Rotating indexing

2000-11-17 Thread Zenon Panoussis

Author: Zenon Panoussis
Email: [EMAIL PROTECTED]
Message:
> I think that the right way to those webmasters is to use robots.txt.

Oh, I don't mean that they don't want their sites indexed at all; only that they get 
grumpy if you hit them with 100 requests per minute. Rotating the targets would be a 
way to put the indexer in "polite mode" without adding delays to the indexing itself. 

Z

Reply: <http://search.mnogo.ru/board/message.php?id=766>

__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




UdmSearch: Webboard: Rotating indexing

2000-11-16 Thread Zenon Panoussis

Author: Zenon Panoussis
Email: [EMAIL PROTECTED]
Message:

Some webmasters can be terribly grumpy about their servers being hit continuously from 
the same IP, even if they are sitting on monster machines capable of serving any 
number of requests. So I wonder if there is any way to force the indexer to rotate 
between sites. That is, to make it change site for every page it fetches (if multiple 
sites are indexed) instead of first indexing all pages on one server (or one depth 
level) and then moving to the next.

Z


Reply: <http://search.mnogo.ru/board/message.php?id=763>

__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




UdmSearch: Webboard: How to index pages and docs that are not linked?

2000-11-16 Thread Zenon Panoussis

Author: Zenon Panoussis
Email: [EMAIL PROTECTED]
Message:
> How can I index HTML pages or other documents that are not linked from other pages?
> 
> I found that indexer only processes the pages that are linked from the main page and 
> further.
> When I put a 'loose' HTML document in my server root and I have no link to it, it 
> will not be indexed!?


The spider cannot guess that the document is there, so of course it can't find it. 

There are a couple of things you can do to solve this. One is to add a "Server page 
full_name_of_document" statement to your indexer.conf file. Another is to remove your 
index.html file(s) temporarily, make sure that the webserver allows directory 
browsing, index the site and then put back index.html. 
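For instance, something like this in indexer.conf (the URL is of course just a 
made-up example):

==
Server page http://www.yourserver.dom/docs/unlinked-page.html
==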

Z


Reply: <http://search.mnogo.ru/board/message.php?id=762>

__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




UdmSearch: Webboard: Another mirroring suggestion

2000-11-15 Thread Zenon Panoussis

Author: Zenon Panoussis
Email: [EMAIL PROTECTED]
Message:

If a directory is first indexed and mirrored, and then removed from the indexer.conf 
file with DeleteNoServer=yes, the indexer does not delete the mirrored files. I think 
it should, because otherwise the mirrors grow forever with obsolete files and 
eventually become useless both for off-line indexing and for actual mirroring.

Z


Reply: <http://search.mnogo.ru/board/message.php?id=758>

__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




UdmSearch: Webboard: Mirroring

2000-11-14 Thread Zenon Panoussis

Author: Zenon Panoussis
Email: [EMAIL PROTECTED]
Message:

If MirrorRoot is specified in indexer.conf, 
mnogosearch copies the files it indexes to 
directories such as mirror_root/http/domain/dir .
I can see three possible improvements in the 
mirroring behaviour. The first two should be easy 
to fix, while the third is more of a long-term 
improvement: 

1. The ../http/.. directory could be eliminated. 
   Not only is it unnecessary, but it also gives 
   an ugly directory structure if you would want 
   to actually make the mirror accessible to 
   outsiders. 
2. Only files that are actually indexed are 
   mirrored. This is very sensible for indexing 
   purposes, but it defeats other possible uses 
   of a mirror, such as backup or protection of 
   a site from being forcefully taken down. There 
   should be a MirrorAll command to override the 
   Allow and Disallow commands and force *all* 
   files to be mirrored, while the Allow and 
   Disallow commands still apply to what is 
   actually indexed. 
3. If the indexer can be used as a combined 
   mirroring and indexing tool, then functionality  
   could be added to translate internal absolute 
   links to either relative links or translated 
   links.  E.g., if I would index and mirror the 
   mnogosearch site , the indexer could translate 
   all http://search.mnogo.ru/whatever links in 
   the mirrored pages to /whatever or to 
   http://mysite/mirrors/whatever. I suspect that 
   lots of the code for this could be taken from 
   the wget code. 

Z

   

Reply: <http://search.mnogo.ru/board/message.php?id=750>

__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




UdmSearch: Webboard: Looping URLs

2000-11-12 Thread Zenon Panoussis

Author: Zenon Panoussis
Email: [EMAIL PROTECTED]
Message:
Using 3.1.8 with MySQL 3.23.24 on RH7.

I am indexing part of a site with 
Server Path http://site/dir/dir/dir/ . Everything 
in the directories to be indexed is normal HTML 
with no funny stuff. Most directory indices are 
auto-generated by the web server. Yet the indexer 
loops. It reads the auto-generated index at / , 
adds the files in it to the database, re-reads it 
with a new name, adds new files to be indexed, ad 
infinitum. Left alone overnight, a few directories
containing a couple of thousand files, produced a 
pile of more than 90.000 entries with status 0 or 
200. 

The problem directories are at 
Server Path http://www.xs4all.nl/~kspaink/cos/ and 
there are no dynamically created pages in them. 
The resulting URLs in the database look like 
http://www.xs4all.nl/~kspaink/cos/SecrServ/ops/go732/?952181106go732xhtmgo732l.htmgo732i.htmgo732q.htmgo732.htm
 

You see the loop. The only really existing dir is 
the one before the question mark. The question 
mark itself and all what comes after it are 
"invented" by the indexer. 

There are references in the to-be-indexed pages to 
URLs higher than the to-be-indexed path in the 
form of . Could that 
be confusing the indexer?

Z



Reply: <http://search.mnogo.ru/board/message.php?id=731>

__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




UdmSearch: Webboard: Manually Deleting BAD Urls

2000-11-12 Thread Zenon Panoussis

Author: Zenon Panoussis
Email: [EMAIL PROTECTED]
Message:

Try "indexer -C -s 403" or whatever status URLs you want to get rid of.

Z


> Allo,
> I forgot to switch DeleteBad to YES...
> now we have about 26K of bad URLs... can I delete them manually via MyAdmin? I do 
> not wish to use -a because my bandwidth is limited..
> 
> <PRE>
>  Status    Expired      Total
>  ------------------------------
>       0      56536      56536  Not indexed yet
>       1          1          1  Unknown status
>     200      36793      66223  OK
>     301          1          6  Moved Permanently
>     302       7891      16368  Moved Temporarily
>     304       1353       1377  Not Modified
>     400          4          4  Bad Request
>     401          1        937  Unauthorized
>     403      10652      10652  Forbidden
>     404        625        625  Not found
>     503        128        128  Service Unavailable
>     504      20875      71462  Gateway Timeout
>  ------------------------------
>   Total     134860     224319 
> </PRE>
> 
> AJ Khan

Reply: <http://search.mnogo.ru/board/message.php?id=730>

__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




Re: UdmSearch: Grrrr!

2000-11-12 Thread Zenon Panoussis


Problem solved. There was a pointer to mysql in /etc/ld.so.conf 
that pointed to the wrong place. Correcting the pointer and 
recompiling mnogosearch didn't help. I ended up removing the 
pointer, uninstalling mysql completely, reinstalling it again, 
and then recompiling and reinstalling mnogosearch. Everything 
works properly now. 

Z

Original message=

> mnogosearch 3.1.8, mysql 3.23.22
> 
> This happened: 
> 
> The search worked fine. Then I re-installed MySQL (3.23 instead 
> of 3.22) and Apache, and the directory structure of both changed. 
> I moved the old search.cgi to the new cgi-bin. I exported the old 
> database with mysqldump and re-imported it in the new MYI/MYD 
> format in the same (deleted and re-created) database. The indexer 
> works fine in the new setup with the old configuration. The search 
> does not; it returns "an error occured". 
> 
> This is what I tried: 
> 
> - Searched the Apache and MySQL error logs. Nothing there. Most 
>   important, there are no "access denied" messages in the mysql log, 
>   meaning that the search never even reaches mysql before it fails.
> - Recompiled and reinstalled mnogosearch and copied the new search.cgi 
>   to cgi-bin. It didn't help.
> - Double-checked search.htm. This shouldn't be necessary since both 
>   the database and search.htm are the same as before, but anyway. The 
>   DBAddr statement is identical to the one in indexer.conf, including 
>   trailing slash. So are the DBMode and charset statements. 
> - Beat my wife, screamed to the dog, kicked my children and broke my 
>   monitor. That didn't help either. 
> 
> Finally I straced search.cgi, but I don't understand the output. If 
> you do, you'll find it below. 
> 
> Any ideas? 
> 
> Z
> 
> =strace.out=
> 
> execve("/var/www/cgi-bin/search.cgi", ["/var/www/cgi-bin/search.cgi"], [/*
> 24 vars */]) = 0
> _sysctl({{CTL_KERN, KERN_OSRELEASE}, 2, "2.2.16-22", 9, NULL, 0}) = 0
> brk(0)  = 0x80908c0
> old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1,
> 0) = 0x40016000
> open("/etc/ld.so.preload", O_RDONLY)= -1 ENOENT (No such file or
> directory)
> open("/etc/ld.so.cache", O_RDONLY)  = 4
> fstat64(4, 0xb32c)  = -1 ENOSYS (Function not
> implemented)
> fstat(4, {st_mode=S_IFREG|0644, st_size=21769, ...}) = 0
> old_mmap(NULL, 21769, PROT_READ, MAP_PRIVATE, 4, 0) = 0x40017000
> close(4)= 0
> open("/usr/lib/mysql/libmysqlclient.so.9", O_RDONLY) = 4
> fstat(4, {st_mode=S_IFREG|0755, st_size=196204, ...}) = 0
> read(4, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0 d\0\000"..., 4096)
> = 4096
> old_mmap(NULL, 172480, PROT_READ|PROT_EXEC, MAP_PRIVATE, 4, 0) = 0x4001d000
> mprotect(0x40036000, 70080, PROT_NONE)  = 0
> old_mmap(0x40036000, 69632, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 4,
> 0x18000) = 0x40036000
> old_mmap(0x40047000, 448, PROT_READ|PROT_WRITE,
> MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x40047000
> close(4)= 0
> open("/lib/libm.so.6", O_RDONLY)= 4
> fstat(4, {st_mode=S_IFREG|0755, st_size=493588, ...}) = 0
> read(4, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\300I\0"..., 4096)
> = 4096
> old_mmap(NULL, 125352, PROT_READ|PROT_EXEC, MAP_PRIVATE, 4, 0) = 0x40048000
> mprotect(0x40066000, 2472, PROT_NONE)   = 0
> old_mmap(0x40066000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 4,
> 0x1d000) = 0x40066000
> close(4)= 0
> open("/usr/lib/libz.so.1", O_RDONLY)= 4
> fstat(4, {st_mode=S_IFREG|0755, st_size=58940, ...}) = 0
> read(4, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\\36\0"..., 4096)
> = 4096
> old_mmap(NULL, 54064, PROT_READ|PROT_EXEC, MAP_PRIVATE, 4, 0) = 0x40067000
> mprotect(0x40073000, 4912, PROT_NONE)   = 0
> old_mmap(0x40073000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 4,
> 0xb000) = 0x40073000
> close(4)= 0
> open("/lib/libc.so.6", O_RDONLY)= 4
> fstat(4, {st_mode=S_IFREG|0755, st_size=4686077, ...}) = 0
> read(4, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\230\270"..., 4096)
> = 4096
> old_mmap(NULL, 1167368, PROT_READ|PROT_EXEC, MAP_PRIVATE, 4, 0) =
> 0x40075000
> mprotect(0x40189000, 36872, PROT_NONE)  = 0
> old_mmap(0x40189000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 4,
> 0x113000) = 0x40189000
> old_mmap(0x4018f000, 12296, PROT_READ|PROT_WRITE,
> MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x4018f000
> close(4)= 0
> open("/lib/libnsl.so.1", O_RDONLY)  = 4
> fstat(4, {st_mode=S_IFREG|0755, st_size=392107, ...}) = 0
> read(4, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0p?\0\000"..., 4096)
> = 4096
> old_mmap(NULL, 93120, PROT_READ|PROT_EXEC, MAP_PRIVATE, 4, 0) = 0x40193000
> mprotect(0x401a7000, 11200, PROT_NONE)  = 0
> old_mmap(0x401a7000, 4

UdmSearch: Grrrr!

2000-11-11 Thread Zenon Panoussis

 
mnogosearch 3.1.8, mysql 3.23.22

This happened: 

The search worked fine. Then I re-installed MySQL (3.23 instead 
of 3.22) and Apache, and the directory structure of both changed. 
I moved the old search.cgi to the new cgi-bin. I exported the old 
database with mysqldump and re-imported it in the new MYI/MYD 
format in the same (deleted and re-created) database. The indexer 
works fine in the new setup with the old configuration. The search 
does not; it returns "an error occured". 

This is what I tried: 

- Searched the Apache and MySQL error logs. Nothing there. Most 
  important, there are no "access denied" messages in the mysql log, 
  meaning that the search never even reaches mysql before it fails.
- Recompiled and reinstalled mnogosearch and copied the new search.cgi 
  to cgi-bin. It didn't help.
- Double-checked search.htm. This shouldn't be necessary since both 
  the database and search.htm are the same as before, but anyway. The 
  DBAddr statement is identical to the one in indexer.conf, including 
  trailing slash. So are the DBMode and charset statements. 
- Beat my wife, screamed to the dog, kicked my children and broke my 
  monitor. That didn't help either. 

Finally I straced search.cgi, but I don't understand the output. If 
you do, you'll find it below. 

Any ideas? 

Z

=strace.out=

execve("/var/www/cgi-bin/search.cgi", ["/var/www/cgi-bin/search.cgi"], [/* 24 vars 
*/]) = 0
_sysctl({{CTL_KERN, KERN_OSRELEASE}, 2, "2.2.16-22", 9, NULL, 0}) = 0
brk(0)  = 0x80908c0
old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 
0x40016000
open("/etc/ld.so.preload", O_RDONLY)= -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY)  = 4
fstat64(4, 0xb32c)  = -1 ENOSYS (Function not implemented)
fstat(4, {st_mode=S_IFREG|0644, st_size=21769, ...}) = 0
old_mmap(NULL, 21769, PROT_READ, MAP_PRIVATE, 4, 0) = 0x40017000
close(4)= 0
open("/usr/lib/mysql/libmysqlclient.so.9", O_RDONLY) = 4
fstat(4, {st_mode=S_IFREG|0755, st_size=196204, ...}) = 0
read(4, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0 d\0\000"..., 4096) = 4096
old_mmap(NULL, 172480, PROT_READ|PROT_EXEC, MAP_PRIVATE, 4, 0) = 0x4001d000
mprotect(0x40036000, 70080, PROT_NONE)  = 0
old_mmap(0x40036000, 69632, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 4, 0x18000) = 
0x40036000
old_mmap(0x40047000, 448, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, 
-1, 0) = 0x40047000
close(4)= 0
open("/lib/libm.so.6", O_RDONLY)= 4
fstat(4, {st_mode=S_IFREG|0755, st_size=493588, ...}) = 0
read(4, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\300I\0"..., 4096) = 4096
old_mmap(NULL, 125352, PROT_READ|PROT_EXEC, MAP_PRIVATE, 4, 0) = 0x40048000
mprotect(0x40066000, 2472, PROT_NONE)   = 0
old_mmap(0x40066000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 4, 0x1d000) = 
0x40066000
close(4)= 0
open("/usr/lib/libz.so.1", O_RDONLY)= 4
fstat(4, {st_mode=S_IFREG|0755, st_size=58940, ...}) = 0
read(4, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\\36\0"..., 4096) = 4096
old_mmap(NULL, 54064, PROT_READ|PROT_EXEC, MAP_PRIVATE, 4, 0) = 0x40067000
mprotect(0x40073000, 4912, PROT_NONE)   = 0
old_mmap(0x40073000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 4, 0xb000) = 
0x40073000
close(4)= 0
open("/lib/libc.so.6", O_RDONLY)= 4
fstat(4, {st_mode=S_IFREG|0755, st_size=4686077, ...}) = 0
read(4, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\230\270"..., 4096) = 4096
old_mmap(NULL, 1167368, PROT_READ|PROT_EXEC, MAP_PRIVATE, 4, 0) = 0x40075000
mprotect(0x40189000, 36872, PROT_NONE)  = 0
old_mmap(0x40189000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 4, 0x113000) 
= 0x40189000
old_mmap(0x4018f000, 12296, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, 
-1, 0) = 0x4018f000
close(4)= 0
open("/lib/libnsl.so.1", O_RDONLY)  = 4
fstat(4, {st_mode=S_IFREG|0755, st_size=392107, ...}) = 0
read(4, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0p?\0\000"..., 4096) = 4096
old_mmap(NULL, 93120, PROT_READ|PROT_EXEC, MAP_PRIVATE, 4, 0) = 0x40193000
mprotect(0x401a7000, 11200, PROT_NONE)  = 0
old_mmap(0x401a7000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 4, 0x13000) = 
0x401a7000
old_mmap(0x401a8000, 7104, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, 
-1, 0) = 0x401a8000
close(4)= 0
open("/lib/libcrypt.so.1", O_RDONLY)= 4
fstat(4, {st_mode=S_IFREG|0755, st_size=82333, ...}) = 0
read(4, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\200\17"..., 4096) = 4096
old_mmap(NULL, 184252, PROT_READ|PROT_EXEC, MAP_PRIVATE, 4, 0) = 0x401aa000
mprotect(0x401af000, 163772, PROT_NONE) = 0
old_mmap(0x401af000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVA

UdmSearch: Refusing to index

2000-10-30 Thread Zenon Panoussis


v 3.1.8:

I have been indexing, and once in a while ^C-ing the indexer 
in order to do something else. The current progress status 
looks like this:

  [root@goat /root]# /usr/local/mnogosearch/sbin/indexer -S

  UdmSearch statistics

  Status    Expired      Total
  ------------------------------
       0          0        604  Not indexed yet
     200          0       4524  OK
     301          0          9  Moved Permanently
     302          0          6  Moved Temporarily
     304          0        311  Not Modified
     403          0          1  Forbidden
     404          0         86  Not found
     503          0         19  Service Unavailable
     504          0         26  Gateway Timeout
  ------------------------------
   Total          0       5586

So I start the indexer again, and here is what it does: 

  [root@goat /root]# /usr/local/mnogosearch/sbin/indexer 
  Indexer[1252]: indexer from UdmSearch v.3.1.8/MySQL started with
'/usr/local/mnogosearch/etc/indexer.conf'
  Indexer[1252]: [1] Done (1 seconds)

Namely nothing. It has 604 unwalked URLs, yet it refuses to 
walk them. 

To make sure, I add a couple of new Server statements to the 
indexer.conf file and try again:


  Indexer[1068]: indexer from UdmSearch v.3.1.8/MySQL started with
'/usr/local/mnogosearch/etc/indexer.conf'
  Indexer[1152]: [1]
http://www.cedar.net/users/dvanhorn/Gallery/arscc.htm
  Indexer[1152]: [1] http://www.cedar.net/robots.txt
  Indexer[1152]: [1] http://www.cisar.org/
  Indexer[1152]: [1] http://www.cisar.org/robots.txt
  Indexer[1152]: [1] Done (102 seconds)

Those were the new sites to be indexed. No go. The indexer fetches 
robots.txt and then refuses to walk. 

Until now everything seemed to work perfectly well. Any ideas on 
what might be causing this weird behaviour? 

Regards,
Z


-- 
oracle@everywhere: The ephemeral source of the eternal truth...
__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




UdmSearch: Bug report

2000-10-30 Thread Zenon Panoussis



UdmSearch version: 3.1.7
Platform:  i586
OS:RH Linux 6.2 / 2.2.16
Database:  MySQL 9.38 / 3.22.32
Statistics:

Perl
Severity: cosmetic. 

The search page reports results +1. E.g. if 20 results per page are requested, the 
caption on the results page will say "Displaying documents 1-21 of xxx found". If less 
than one page's worth of results is found, the caption will still increase the count by 1, 
e.g. "Displaying documents 1-4 of 3 found".

Similarly, at the bottom of the first results page, links appear to subsequent pages 
even when all the results fit on the first page. E.g., if only three results have been 
returned, there will still be a link like "<< Previous 1 2 3 Next >>" at the bottom of 
the page, where "Previous" and "Next" are dead, but 2 and 3 are live and point to a 
page containing the last result of the three already shown. 

For a live example go to http://194.109.240.22/cgi-bin/search.cgi and search for 
"bronson". 

Regards,
Z

__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




UdmSearch: New message on the WebBoard #1: Period directive?

2000-10-29 Thread Zenon Panoussis

Author: Zenon Panoussis
Email: [EMAIL PROTECTED]
Message:
The instructions in the indexer.conf-dist file say that M is minute and m is month. 
However, the examples given right after the instructions indicate the opposite. Which 
is correct?

Reply: <http://search.mnogo.ru/board/message.php?id=611>

__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




UdmSearch: New message on the WebBoard #1: udmsearch.robots not found ?!

2000-10-27 Thread Zenon Panoussis

Author: Zenon Panoussis
Email: [EMAIL PROTECTED]
Message:
I just installed 3.1.7 on Linux 2.2.16 with:
./configure --with-mysql  (3.22.32)
make
make install
and no changes in the configuration file.

I proceeded to create one database and tables with:
mysqladmin create udmsearch
mysql udmsearch < multi.txt
Do I understand the INSTALL instructions correctly in that multi.txt replaces both 
create.txt and all stop.lang.txt files? 

I edited indexer.conf minimally and left Robots at the default yes. 

Running indexer fails with the following error:
Indexer[15034]: indexer from UdmSearch v.3.1.7/MySQL started with 
'/usr/local/udmsearch/etc/indexer.conf'
Indexer[15034]: [1] Error: '#1146: Table 'udmsearch.robots' doesn't exist'

Changing to "Robots no" in indexer.conf doesn't help.

I grepped the entire documentation for 'udmsearch.robots' and found nothing. Thus, I 
have no idea what the udmsearch.robots table needs to look like and how to create it.

Does anyone know what's wrong and how it can be fixed?

Regards,
Z

Reply: <http://search.mnogo.ru/board/message.php?id=602>

__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]