[htdig] Reply: Re: [htdig][htdig] DB2 problem ...: missing or empty key value specified
Hi Geoff

Thanx for the quick answer. Your assumption is right (Shouldn't htdig understand a bit javascript??)

Thanx, Ruedi

___
The file:

<HTML>
<SCRIPT LANGUAGE="JavaScript">
<!--
if (navigator.userAgent.indexOf("MSIE")!=-1) { window.location.replace("framems.html"); }
else {window.location.replace("framenav.html"); }
//-->
</SCRIPT>
</HTML>

___
Output from /opt/www/htdig/bin/htdig - -c /opt/www/htdig/conf/susedig.conf -i:

1:0:http://linux01.hasler.ascom.ch/
New server: linux01.hasler.ascom.ch, 80
Retrieval command for http://linux01.hasler.ascom.ch/robots.txt: GET /robots.txt HTTP/1.0
User-Agent: htdig/3.1.5 ([EMAIL PROTECTED])
Host: linux01.hasler.ascom.ch
Header line: HTTP/1.1 200 OK
Header line: Date: Fri, 08 Sep 2000 06:13:32 GMT
Header line: Server: Apache/1.3.6 (Unix) (SuSE/Linux) mod_perl/1.19 PHP/3.0.11
Header line: Last-Modified: Wed, 06 Sep 2000 11:03:24 GMT
Translated Wed, 06 Sep 2000 11:03:24 GMT to 2000-09-06 11:03:24 (100)
And converted to Wed, 06 Sep 2000 11:03:24
Header line: ETag: "22c80b-9d-39b6247c"
Header line: Accept-Ranges: bytes
Header line: Content-Length: 157
Header line: Connection: close
Header line: Content-Type: text/plain
Header line: returnStatus = 0
Read 157 from document
Read a total of 157 bytes
Parsing robots.txt file using myname = susedig
Robots.txt line: # exclude help system from robots
Robots.txt line: User-agent: *
Found 'user-agent' line: *
Robots.txt line: Disallow: /hilfe/ /manual/ /support-db/ /gif/
Found 'disallow' line: /hilfe/ /manual/ /support-db/ /gif/
Robots.txt line: # but allow htdig to index our doc-tree
Robots.txt line: User-agent: susedig
Found 'user-agent' line: susedig
Pattern: pushed
pick: linux01.hasler.ascom.ch, # servers = 1
0:0:0:http://linux01.hasler.ascom.ch/:
Retrieval command for http://linux01.hasler.ascom.ch/: GET / HTTP/1.0
User-Agent: htdig/3.1.5 ([EMAIL PROTECTED])
Host: linux01.hasler.ascom.ch
Header line: HTTP/1.1 200 OK
Header line: Date: Fri, 08 Sep 2000 06:13:32 GMT
Header line: Server: Apache/1.3.6 (Unix) (SuSE/Linux) mod_perl/1.19 PHP/3.0.11
Header line: Last-Modified: Fri, 25 Aug 2000 09:51:54 GMT
Translated Fri, 25 Aug 2000 09:51:54 GMT to 2000-08-25 09:51:54 (100)
And converted to Fri, 25 Aug 2000 09:51:54
Header line: ETag: "22c9b1-1eb-39a641ba"
Header line: Accept-Ranges: bytes
Header line: Content-Length: 491
Header line: Connection: close
Header line: Content-Type: text/html
Header line: returnStatus = 0
Read 491 from document
Read a total of 491 bytes
Tag: HTML, matched -1
Tag: HEAD, matched -1
Tag: TITLE, matched 0
Tag: /TITLE, matched 1
title:
Tag: META NAME="GENERATOR" CONTENT="StarOffice/5.1 (Linux)", matched 20
Tag: META NAME="AUTHOR" CONTENT="Ruedi Hofer", matched 20
Tag: META NAME="CREATED" CONTENT="2806;19565200", matched 20
Tag: META NAME="CHANGED" CONTENT="16010101;0", matched 20
Tag: /HEAD, matched -1
Tag: body bgcolor="#FF", matched -1
Tag: SCRIPT LANGUAGE="JavaScript", matched -1
Tag: /SCRIPT, matched -1
Tag: /body, matched -1
Tag: /HTML, matched -1
size = 491
pick: linux01.hasler.ascom.ch, # servers = 1
htdig: Run complete
htdig: 1 server seen:
htdig: linux01.hasler.ascom.ch:80 1 document

[EMAIL PROTECTED] on 07.09.2000 18:14:57
To: [EMAIL PROTECTED] @ MailGate
Cc: [EMAIL PROTECTED] @ MailGate
Subject: Re: [htdig] "[htdig] DB2 problem ...: missing or empty key value specified"

On Thu, 7 Sep 2000, Hofer Ruedi wrote:

> linux01:~ # /opt/www/htdig/bin/htmerge - -s -c /opt/www/htdig/conf/susedig.conf
> htmerge: Sorting...
> DB2 problem...: missing or empty key value specified
> htmerge: Total word count: 0
> 0/http://linux01/

The error message is a bit cryptic, in part because htmerge doesn't do any error checking for empty databases. Your problem is that nothing was indexed--see the "total word count 0" bit? So you can solve your problem by seeing why htdig didn't index anything. Try running htdig - and taking a look at the HTTP messages among other things.

--
-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/

To unsubscribe from the htdig mailing list, send a message to [EMAIL PROTECTED] You will receive a message to confirm this.
List archives: http://www.htdig.org/mail/menu.html
FAQ: http://www.htdig.org/FAQ.html
Re: [htdig] Reply: Re: [htdig][htdig] DB2 problem ...: missing or empty key value specified
[EMAIL PROTECTED] wrote:

> Hi Geoff
> Thanx for the quick answer. Your assumption is right (Shouldn't htdig understand a bit javascript??)
> Thanx, Ruedi
> ___
> The file:
> <HTML>
> <SCRIPT LANGUAGE="JavaScript">
> <!--
> if (navigator.userAgent.indexOf("MSIE")!=-1) { window.location.replace("framems.html"); }
> else {window.location.replace("framenav.html"); }
> //-->
> </SCRIPT>
> </HTML>

No. One simple reason: an indexer should not interpret any client-side code, so that it cannot get stuck in endless loops. Your own example gives another reason: since ht://Dig is by no means a graphical browser, there is no "window".

You can still follow that link by specifying a <LINK REL="start" HREF="noframes.html"> in the HTML preamble of the document.

hth, Torsten

--
InWise - Wirtschaftlich-Wissenschaftlicher Internet Service GmbH
Waldhofstraße 14        Tel: +49-4101-403605
D-25474 Ellerbek        Fax: +49-4101-403606
E-Mail: [EMAIL PROTECTED]        Internet: http://www.inwise.de
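Since the indexer ignores the script, the two frame pages named in it are never discovered. As a rough illustration only (this is not part of ht://Dig, and the function name `js_redirect_targets` is made up here), the following Python sketch lists the literal targets of `location.replace(...)` calls in a page, so that they can be linked from plain HTML or added to the start URLs by hand. It assumes the targets are literal strings; computed URLs are not recognised.

```python
import re

# Matches window.location.replace("target") or location.replace('target').
# Only literal string arguments are recognised; computed URLs are skipped.
_REDIRECT_RE = re.compile(
    r"""(?:window\.)?location\.replace\(\s*(["'])(.*?)\1\s*\)"""
)


def js_redirect_targets(html):
    """Return literal redirect targets found in html, in order, without duplicates."""
    seen = []
    for match in _REDIRECT_RE.finditer(html):
        target = match.group(2)
        if target not in seen:
            seen.append(target)
    return seen


# The page quoted in the message above.
page = '''<HTML><SCRIPT LANGUAGE="JavaScript"><!--
if (navigator.userAgent.indexOf("MSIE")!=-1) { window.location.replace("framems.html"); }
else {window.location.replace("framenav.html"); }
//--></SCRIPT></HTML>'''

for target in js_redirect_targets(page):
    print(target)
```

A regular expression is used rather than a JavaScript interpreter on purpose: it finds candidate links without executing any client-side code, which is the concern raised in the reply.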
[htdig] Re: Reply: Re: [htdig] [htdig] DB2 problem ...: missing or empty key value specified
At 1:07 PM +0100 9/8/00, [EMAIL PROTECTED] wrote:

> Thanx for the quick answer. Your assumption is right (Shouldn't htdig understand a bit javascript??)

Only enough to ignore it. There is absolutely no reason for htdig to parse JavaScript or Java or... For one, it would contribute significantly to code bloat. For another, there still wouldn't be a way to work out exactly what links are supposed to be indexed--we'd have to anticipate user action, etc. Better to just ignore it.

--
-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/
[htdig] [htdig] DB2 problem ...: missing or empty key value specified
Hi

Has someone solved the problem below?

Cheers, Ruedi

linux01:~ # /opt/www/htdig/bin/htmerge - -s -c /opt/www/htdig/conf/susedig.conf
htmerge: Sorting...
DB2 problem...: missing or empty key value specified
htmerge: Total word count: 0
0/http://linux01/
htmerge: Total documents: 1
htmerge: Total doc db size (in K): 0
linux01:~ #

I guess the mails below were something similar, but there was no solution afaik.

-

I have downloaded the latest tar file from htdig.org, unpacked it, configured it and installed it (using the usual make and make install).

After running htdig - no error msgs
After running htmerge - -s I get this:

htmerge: sorting
htmerge: Removing doc #1
DB2 problem ...: missing or empty key value specified
htmerge: Total word count: 0
Deleted, no excerpt: 1 /http://localhost/home-john
htmerge: Total documents: 0
htmerge: Total doc db size in (k): 0

The system on which ht://Dig should run is newly installed (no uncontrolled compiler changes). I got no errors during the make or make install process. Hope you find the source of the problem very soon ;-), maybe you could point me in the right direction, because I could have a look at the sources too.

Regards
Bernhard

Gilles Detillieux wrote:

According to Bernhard Schindlholzer:

> I'm using HtDig 3.2.1 on Suse Linux 6.1
> After installing I try to run the rundig script and this is what I get:
>
> htdig: Run complete
> htdig: 1 server seen
> htdig: localhost 80: 1 document
> DB2 problem ...: missing or empty key value specified
> htmerge: Total documents: 0
> htmerge: Total doc db size in (k): 0
>
> But there are files in the root directory of my webserver
> Any ideas?

We've gotten reports of this error on a few occasions before, but have never managed to follow it up thoroughly enough to nail down the cause of it. The error suggests a corrupt database, but if it fails the first time you run it, there's got to be more to it than that.

First of all, did you install ht://Dig from the binary RPMs, or did you build it from the source? If you installed a binary RPM, I'd first recommend that you rebuild it from the source on your system, to rule out library incompatibilities or something of the sort. Next, you should run htdig and htmerge with - -s, to see at which point the error occurs, and what led up to it.

--
Gilles R. Detillieux              E-mail: [EMAIL PROTECTED]
Spinal Cord Research Centre       WWW: http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone: (204)789-3766
Winnipeg, MB R3E 3J7 (Canada)     Fax: (204)789-3930

To unsubscribe from the htdig mailing list, send a message to [EMAIL PROTECTED] containing the single word unsubscribe in the SUBJECT of the message.
Re: [htdig] [htdig] DB2 problem ...: missing or empty key value specified
On Thu, 7 Sep 2000, Hofer Ruedi wrote:

> linux01:~ # /opt/www/htdig/bin/htmerge - -s -c /opt/www/htdig/conf/susedig.conf
> htmerge: Sorting...
> DB2 problem...: missing or empty key value specified
> htmerge: Total word count: 0
> 0/http://linux01/

The error message is a bit cryptic, in part because htmerge doesn't do any error checking for empty databases. Your problem is that nothing was indexed--see the "total word count 0" bit? So you can solve your problem by seeing why htdig didn't index anything. Try running htdig - and taking a look at the HTTP messages among other things.

--
-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/
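The diagnosis above rests on one line of the htmerge output: a total word count of 0 means nothing was indexed. A small Python sketch of that check follows, for anyone who wants to catch the condition in a wrapper script before the cryptic DB2 message confuses matters. The function names are invented for this illustration, and the sample text is the output quoted in this thread; the exact wording of the htmerge line is taken from that output and may differ in other versions.

```python
import re


def total_word_count(htmerge_output):
    """Return the number after 'htmerge: Total word count:', or None if absent."""
    match = re.search(r"htmerge: Total word count:\s*(\d+)", htmerge_output)
    return int(match.group(1)) if match else None


def index_is_empty(htmerge_output):
    """True when htmerge reported a word count of zero."""
    return total_word_count(htmerge_output) == 0


# Output quoted in the message above.
sample = """htmerge: Sorting...
DB2 problem...: missing or empty key value specified
htmerge: Total word count: 0
0/http://linux01/
"""

if index_is_empty(sample):
    print("nothing was indexed; check the htdig run, not htmerge")
```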
Re: [htdig] DB2 problem...: missing or empty key value specified
On Thu, 27 Jul 2000, Gilles Detillieux wrote:

> I've never seen this message as being connected to the removal of a document before, but I guess when you completely empty out the database, that would do it. I don't know what, if anything, can be done by htmerge to prevent this.

There needs to be a test in htmerge to exit gracefully if there are no documents in the database. An error message of some sort is probably warranted. Perhaps all of the utilities should do this test and give an error message. It would certainly stop the following. (In 3.2, htmerge no longer works the same way, but I think there still needs to be better error checking. I believe I put this in the

> Now this is a problem. I guess it's not critical if it only happens when the word database is completely empty, but htfuzzy should be robust enough to handle this without crashing.

I'm surprised at this. I think I missed which part of htfuzzy was being used. What was the exact call to htfuzzy?

--
-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/
Re: [htdig] DB2 problem...: __db_mpool.share: Permission denied
According to Uta Becht:

> I installed htdig. DB-Files are ok. When I try and search I get the error:
>
> htsearch detected an error: Internal Server Error
> The server encountered an internal error or misconfiguration and was unable to complete your request. Please contact the server administrator, [EMAIL PROTECTED] and inform them of the time the error occurred, and anything you might have done that may have caused the error.
>
> The apache error log says:
>
> DB2 problem...: __db_mpool.share: Permission denied
> DB2 problem...: db_appinit: Permission denied
> [Thu Jul 13 11:18:27 2000] [error] Premature end of script headers: /netsite/online/cgi-bin/medakt/htsearch

Well, as the error messages indicate, there seems to be a problem with permissions. Check to make sure all your db files are readable, and the directories leading up to them are executable, by the user ID under which apache runs.

I'd much prefer that questions of this sort went to the list rather than just to me, so others could benefit from it as well (and so others would have a chance to reply). For this reason, I'm cc'ing the list.

--
Gilles R. Detillieux              E-mail: [EMAIL PROTECTED]
Spinal Cord Research Centre       WWW: http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone: (204)789-3766
Winnipeg, MB R3E 3J7 (Canada)     Fax: (204)789-3930
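The check described above (file readable, every directory on the way to it executable) can be sketched in Python. This is an illustration under stated assumptions, not a tool from the ht://Dig distribution: it looks only at the "other" permission bits, which is the relevant case when the web server's user is neither the owner nor in the group of the files, and the function name and the `stop_at` parameter are invented here.

```python
import os
import stat


def permission_problems(db_file, stop_at):
    """Check 'other' permission bits from db_file up to stop_at (inclusive).

    Returns a list of problem descriptions. An empty list means the file has
    the other-read bit and every directory checked has the other-execute bit.
    """
    problems = []
    db_file = os.path.abspath(db_file)
    stop_at = os.path.abspath(stop_at)

    if not os.stat(db_file).st_mode & stat.S_IROTH:
        problems.append("not world-readable: " + db_file)

    directory = os.path.dirname(db_file)
    while True:
        if not os.stat(directory).st_mode & stat.S_IXOTH:
            problems.append("not world-searchable: " + directory)
        parent = os.path.dirname(directory)
        # Stop at the requested directory, or at the filesystem root.
        if directory == stop_at or parent == directory:
            break
        directory = parent
    return problems
```

Calling it with the path of a database file and `"/"` as `stop_at` walks all the way to the root. If the web server user is the owner or in the group of the files, the owner or group bits matter instead, and running a read test as that user is the more reliable check.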
Re: [htdig] DB2 problem
According to Sheri Campbell:

> I am currently running a dig and getting the error message
>
> DB2 problem:.../my/dir/db.docdb.db page 97950 doesn't exist, create flag not set.
>
> Also, when I started the dig and the Db's were being copied over, I got the message that db.docdb.db had an input/output error. I don't know exactly what is wrong or how to fix the problem but I am assuming that the error messages are related.

I'm not quite sure what you mean by "the Db's were being copied over". Were you trying to copy the databases from the indexing system to another before htdig and htmerge had completed, or overwriting the databases on the indexing system while htdig was running? An I/O error while writing a database can most certainly mess things up.

In any case, it sounds like a corrupt database, so the standard remedy is to reindex from scratch. Either run the rundig script, or run "htdig -i" followed by htmerge, and don't do anything with or to the databases until they're completely rebuilt. Make sure you have plenty of disk space available before you begin.

If you still get the errors after that, or during the rebuild, please post a message to the list again, and hopefully one of the developers more knowledgeable in the Berkeley DB package can work with you to get to the bottom of this.

--
Gilles R. Detillieux              E-mail: [EMAIL PROTECTED]
Spinal Cord Research Centre       WWW: http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone: (204)789-3766
Winnipeg, MB R3E 3J7 (Canada)     Fax: (204)789-3930
[htdig] DB2 problem...: missing or empty key value specified
Hi,

Has someone encountered the error message above yet? It keeps coming up when I attempt to run htmerge...

Thanks for any hint,
Seb.
Re: [htdig] DB2 problem...: /var/lib/htdig/common/synonyms.db: Permission denied
According to Fates:

> I install the rpm for htdig. When I ran rundig I received an error about DB2 problem
>
> When I try and search I get the error:
>
> htsearch detected an error. Please report this to the webmaster of this site. The error message is:
> Unable to read word database file '/var/lib/htdig/db/db.words.db'
>
> The apache error log says:
>
> Did you run htmerge?
> DB2 problem...: /var/lib/htdig/common/synonyms.db: Permission denied
> DB2 problem...: /var/lib/htdig/common/word2root.db: Permission denied

Just as for the problem with your config file, you need to make all files in your "common" directory, and your "db" directory, world-readable (e.g. 644 mode).

--
Gilles R. Detillieux              E-mail: [EMAIL PROTECTED]
Spinal Cord Research Centre       WWW: http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone: (204)789-3766
Winnipeg, MB R3E 3J7 (Canada)     Fax: (204)789-3930
[htdig] DB2 problem...: /var/lib/htdig/common/synonyms.db: Permission denied
I installed the rpm for htdig. When I ran rundig I received an error about DB2 problem

When I try and search I get the error:

htsearch detected an error. Please report this to the webmaster of this site. The error message is:
Unable to read word database file '/var/lib/htdig/db/db.words.db'

The apache error log says:

Did you run htmerge?
DB2 problem...: /var/lib/htdig/common/synonyms.db: Permission denied
DB2 problem...: /var/lib/htdig/common/word2root.db: Permission denied
[htdig] DB2 problem...: /db.words.db: page 0: reference count overflow
I'm sure it's a regular question but having built my DB (at just over 0.5 gig) I keep getting:

DB2 problem...: /u/www/virtual/ijack/db/db.words.db: page 0: reference count overflow

on Linux Redhat 5.2

Am I simply at the threshold of my memory (128megs) or is there something I can do, because I'd really like my DB to be able to grow substantially.

Thanks
Sacha Wheeler
Director
Thought Interactive
Re: [htdig] DB2 problem ...: missing or empty key value specified
> redirect: http://Eagle.Underground/home-john
> Rejected URL not in the limits!
> pick: localhost, #servers = 1

Yes, this is the problem. The server is attempting to redirect you to another server (in this case "Eagle.Underground"). Since your limit_urls_to is likely set to ${start_url} or http://localhost/, this doesn't match and nothing is indexed at all.

I'd either change the server config to *not* return the redirect, or I'd add the other name to limit_urls_to.

-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/
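To make the rejection concrete, here is a deliberately simplified Python model of the limit check. It assumes that a URL is accepted when it contains any of the limit_urls_to strings, compared case-insensitively; that is an assumption made for this sketch, not a description of ht://Dig's actual matching code, and the function name is invented.

```python
def within_limits(url, limits):
    """True if url contains any of the limit strings (case-insensitive)."""
    lowered = url.lower()
    return any(limit.lower() in lowered for limit in limits)


# The redirect target from the log above.
redirect_target = "http://Eagle.Underground/home-john"

# With only the start URL as a limit, the redirect target is rejected.
print(within_limits(redirect_target, ["http://localhost/"]))

# Adding the second host name to the limits lets it through.
print(within_limits(redirect_target, ["http://localhost/", "http://Eagle.Underground/"]))
```

Under this model the first call reports that the URL is outside the limits and the second that it is inside, which mirrors the second remedy suggested above: add the other host name to limit_urls_to.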
Re: [htdig] DB2 problem ...: missing or empty key value specified
According to Bernhard Schindlholzer:

> I have downloaded the latest tar file from htdig.org, unpacked it, configured it and installed it (using the usual make and make install).
>
> After running htdig - no error msgs
> After running htmerge - -s I get this:
>
> htmerge: sorting
> htmerge: Removing doc #1
> DB2 problem ...: missing or empty key value specified
> htmerge: Total word count: 0
> Deleted, no excerpt: 1 /http://localhost/home-john
> htmerge: Total documents: 0
> htmerge: Total doc db size in (k): 0

Earlier that day...

According to Bernhard Schindlholzer:

> After installing I try to run the rundig script and this is what I get:
>
> htdig: Run complete
> htdig: 1 server seen
> htdig: localhost 80: 1 document
> DB2 problem ...: missing or empty key value specified
> htmerge: Total documents: 0
> htmerge: Total doc db size in (k): 0
>
> But there are files in the root directory of my webserver

So it seems htdig is only picking up one single document from your database, and htmerge is removing it. Hmm. I'd be interested in knowing what that one document is, and why htmerge deletes it. You say that there are files in the root directory of your server, but htdig isn't picking them up. Surely, there must be some clues about this in the htdig - output?

I wonder if the "missing or empty key value specified" error happens because there's nothing left in the database, or because of something weird that htdig puts into it.

--
Gilles R. Detillieux              E-mail: [EMAIL PROTECTED]
Spinal Cord Research Centre       WWW: http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone: (204)789-3766
Winnipeg, MB R3E 3J7 (Canada)     Fax: (204)789-3930
1998-11: Re: htdig: DB2 problem on AIX 4.1.5 - unexpected file [#371]
Re: http://www.htdig.org/mail/1998-11/0256.html

Thank you for your email! I've reviewed it, as well as the thread of email.

I'm a little concerned about the suggested change. I can't understand why the DB_CREATE flag would cause an "unexpected file format" message. To the best of my recollection, the DB_CREATE flag is only used to turn on the O_CREAT flag on UNIX systems, so I don't see how the error and the fix could be related.

Does htdig turn on the Berkeley DB additional error message output? If not, that might be worthwhile. It doesn't have any performance impact (unlike --enable-diagnostic), and may help shed some light on the failure.

Also, it might be worthwhile to use a debugger to track through the DB open code and determine what error is being returned and why the failure is happening in this case. We don't run on AIX locally I'm afraid, so we can't do this test ourselves.

For future reference, your request is #371.

Regards,
Amy Adams

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Amy Adams                   Berkeley DB Product Manager
Sleepycat Software Inc.     [EMAIL PROTECTED]
394 E. Riding Dr.           +1-617-633-2429
Carlisle, MA 01741          http://www.sleepycat.com

--
To unsubscribe from the htdig mailing list, send a message to [EMAIL PROTECTED] containing the single word "unsubscribe" in the body of the message.
htdig: DB2 problem: unable to create/retrieve page
I continue to get the following error, even though I've moved from a Solaris 2.5.1 machine to 2.6 (2.6 doesn't have a 2Gb file size limitation):

DB2 problem...: /home/challenger/bigler/btcweb/db.docdb: write failed for page 2097151
DB2 problem...: unable to create/retrieve page 1530502

Can anyone shed any light on this, and what I can do about it?

Many thanks!

Tyson

---
M. Tyson Bigler                  SEPTCo Computing Solutions Group
Infrastructure Support           Bellaire Technology Center
[EMAIL PROTECTED]                3737 Bellaire Blvd., Room 1007B
713-245-7476                     Houston, TX 77025
Re: htdig: DB2 problem: unable to create/retrieve page
At 7:31 PM -0500 12/2/98, Tyson Bigler wrote:

> DB2 problem...: /home/challenger/bigler/btcweb/db.docdb: write failed for page 2097151
> DB2 problem...: unable to create/retrieve page 1530502
>
> can anyone shed any light on this, and what I can do about it?

If you're using databases created from scratch, I'd look at how much memory you have. It's a big database after all, and if the server runs out of memory, it might not be able to read or write to the database.

-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/
Re: htdig: DB2 problem on AIX 4.1.5 - unexpected file format
At 5:31 AM -0500 11/5/98, Alexander Bergolth wrote:

> I had the same problem, but I think it's a bug in the Berkeley DB library. (Maybe only on some AIX machines?) In my case the problem occurs when opening some existing database files with db_open having set the DB_CREATE flag. Without DB_CREATE the database can be opened.

Did you report this to SleepyCat?

> I applied the following dirty hack on DB2_db::OpenReadWrite (in DB2_db.cc):
>
>     // Create the database.
>     //
>     // LEO strange error opening existing database files
>     //if ((errno = db_open(filename, DB_BTREE, DB_CREATE, mode, dbenv,
>     //                     dbinfo, dbp)) == 0)
>     if (access(filename, F_OK) == 0)
>         errno = db_open(filename, DB_BTREE, 0, 0, dbenv, dbinfo, dbp);
>     else
>         errno = db_open(filename, DB_BTREE, DB_CREATE, mode, dbenv, dbinfo, dbp);
>     if (errno == 0)
>     // /LEO

Perhaps we should have a configure check for AIX and use this code?

-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/
Re: htdig: DB2 problem
At 6:10 AM -0500 11/24/98, U.O. Telematica Municipale - Comune di Prato wrote:

> Hi folks, I'm Gabriele from Italy. While running htmerge I encountered this error. What about it?

I would wonder if it's a memory problem. Currently htmerge uses a lot of memory, and this can cause problems with the DB code.

> And I found the same problem as George Adams's. I must erase the database dir to make the program work right. Why?

Well, it could be a bug in htmerge/words.cc. I can't see why a document would disappear from the doc index and not the word index, but it would require some debugging. (The words.cc code should see the document marked by htdig/Retriever.cc and remove it, just like the docs.cc code.)

-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/
htdig: DB2 problem
> Am I right in assuming that running "htdig -i -v -s" isn't creating a temporary set of databases and then writing them to the db directory? Because if it did, I'd need over 500MB free on the hard disk, and I wouldn't have that much space free. Any ideas appreciated.

It's just an idea, maybe someone knows better: as far as I know, htdig indeed isn't _explicitly_ creating any files other than the ones you see. But I can imagine that some kind of sorting will need external files - and these could be larger than you would expect. Can you maybe just give it a try, freeing a larger part of your /tmp or so during indexing...?

Iosif Fettich
Re: htdig: DB2 problem
Thank you all, it was hard disk space. Just thought 167MB free after running rundig meant I had enough space. Finally ran with no errors with 600MB free. Total db directory size, 311MB. So, set to index all of every page on my site, the db directory is about 6.5% larger than all of the files indexed.

Now I'll have to try using contrib/wordfreq/ or Geoff's method. I assume halving the database size would not only save disk space, but speed searches.

> used "cut -f 1 db.wordlist | uniq -c | sort -r" to determine how many documents each word was in, then I took the top 500 and edited the list.

Edited db.wordlist, I assume?

Thanks again,
Jeff Hill

Iosif Fettich wrote:

> It's just an idea, maybe someone knows better: as far as I know, htdig isn't indeed creating _explicitly_ other files than what you see. But I can imagine that some kind of sorting will need external files - and these could be larger than you would expect. Can you maybe just give-it a try, freeing a larger part of your /tmp or so during indexing...?

* HR On-Line: The Network for Workplace Issues
** Ph:416-604-7251 -- Fax:416-604-4708
** http://www.hronline.com
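The figures reported above can be checked with a few lines of Python, and the same sketch shows one way to test free space before starting a dig. The 311 MB and 600 MB figures are from this message and the 292 MB document size is from elsewhere in the thread; the function names are invented here, and how much free space a given site needs is left as a parameter, because the thread gives no general rule.

```python
import shutil

MB = 1024 * 1024


def overhead_percent(db_size_mb, docs_size_mb):
    """Percentage by which the databases exceed the indexed documents."""
    return (db_size_mb / docs_size_mb - 1.0) * 100.0


def enough_free_space(path, required_mb):
    """True if the filesystem holding path has at least required_mb free."""
    return shutil.disk_usage(path).free >= required_mb * MB


# 311 MB of databases for roughly 292 MB of indexed files.
print(round(overhead_percent(311, 292), 1))

# The run that finally succeeded started with 600 MB free.
print(enough_free_space(".", 600))
```

The first line of output reproduces the "about 6.5% larger" figure. The second depends on the machine it runs on.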
Re: htdig: DB2 problem
At 8:18 AM -0500 11/17/98, Jeff Hill wrote:

>> used "cut -f 1 db.wordlist | uniq -c | sort -r" to determine how many documents each word was in, then I took the top 500 and edited the list.
>
> Edited db.wordlist, I assume?

No, the bad_words list. I wouldn't suggest editing db.wordlist, it probably wouldn't have good results. Basically, I redirected the output of the commands into a file, edited it and added it on to the bad_words list.

-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/
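For readers who prefer a script to the shell pipeline, the counting step might look roughly like this in Python. It assumes, as `cut -f 1` implies, that each line of db.wordlist starts with the word followed by a tab; the function name and the sample lines are made up for illustration. Unlike `uniq -c`, which only merges adjacent duplicates, the counter does not depend on the input being sorted.

```python
from collections import Counter


def top_words(lines, limit=500):
    """Count the first tab-separated field of each line and return the
    `limit` most frequent (word, count) pairs, most frequent first."""
    counts = Counter()
    for line in lines:
        word = line.split("\t", 1)[0].strip()
        if word:
            counts[word] += 1
    return counts.most_common(limit)


# Made-up sample lines; only the first field matters to the counter.
sample_lines = [
    "the\tdoc1\n",
    "the\tdoc2\n",
    "htdig\tdoc1\n",
    "the\tdoc3\n",
    "htdig\tdoc4\n",
    "rare\tdoc9\n",
]

for word, count in top_words(sample_lines, limit=2):
    print(count, word)
```

Passing an open file object for db.wordlist instead of `sample_lines` gives the candidates. As described above, the result still needs editing by hand before it is appended to the bad_words list, since frequent words are not always meaningless ones.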
Re: htdig: DB2 problem
Iosif Fettich wrote:

>> DB2 problem...: /usr/local/db/db.words.db: write failed for page 61819
>> DB2 problem...: unable to create/retrieve page 618
>>
>> from running rundig
>
> Are you sure you have enough disk space ?

Positive . . . er, I should say, I don't think so -- not unless htdig is creating some temp files I don't know about and erasing them when it fails.

After htdig runs and fails as above, the partition still has 167MB free. Seems sufficient, however, htdig fails after creating over 250MB of db files:

-rw-r--r-- 1 root root 101706752 Nov 14 23:47 db.docdb
-rw-r--r-- 1 root root   5036032 Nov 14 23:47 db.docs.index
-rw-r--r-- 1 root root  92116335 Nov 14 23:36 db.wordlist
-rw-r--r-- 1 root root  63323136 Nov 14 23:36 db.words.db

This seems larger than it used to be.

As far as space details go, I've also got a 130MB swap partition and 94MB RAM (running on a P150).

The databases that are created are functional, although they obviously have problems.

Regards,
Jeff H.

> I. Fettich

--
* HR On-Line: The Network for Workplace Issues
** Ph:416-604-7251 -- Fax:416-604-4708
** http://www.hronline.com
Re: htdig: DB2 problem
Iosif Fettich wrote:

> what's the total size of what you're indexing ?

292MB, I believe. I can't remember if the "du" command works exactly right on Linux, seems like there used to be a problem -- anyway, "du -cks" reports "292359 total", so I'll assume.

>> This seems larger than it used to be.
>
> Significantly different ? I'm not sure anymore: did you say in the last message that you're using 3.1.0b2 ?

I can't remember, but it seems larger by 50MB or so (could be we just keep adding so much). I am, however, running htdig-3.1.0b2, installed Nov. 6.

> If that gives a clue: indexing here about 5000 html documents (approx. 25 MB) generates something like
>
> -rw-r--r-- 1 root root 7284736 Nov 16 03:05 db.docdb
> -rw-r--r-- 1 root root  550912 Nov 16 03:05 db.docs.index
> -rw-r--r-- 1 root root 9905263 Nov 16 03:05 db.wordlist
> -rw-r--r-- 1 root root 9511936 Nov 16 03:05 db.words.db

So, your dbs are actually slightly larger than your document base? Well, if htdig didn't fail, I suppose mine might be slightly larger too, although it should still have enough space.

Am I right in assuming that running "htdig -i -v -s" isn't creating a temporary set of databases and then writing them to the db directory? Because if it did, I'd need over 500MB free on the hard disk, and I wouldn't have that much space free.

Any ideas appreciated.

> It's true, with a badwords list where I put in all meaningless words I was able to spot using contrib/wordfreq/. That almost halved database size.

I'll have to take a look at that, thanks.

Jeff H.

* HR On-Line: The Network for Workplace Issues
** Ph:416-604-7251 -- Fax:416-604-4708
** http://www.hronline.com
Re: htdig: DB2 problem
At 3:09 PM -0500 11/16/98, Jeff Hill wrote:

> I can't remember, but it seems larger by 50MB or so (could be we just keep adding so much). I am, however, running htdig-3.1.0b2, installed Nov. 6.

Two possibilities for larger DB:
1) You're adding more (I have several mailing list archives that grow exponentially).
2) The DB bug was hiding the actual size of your data.

> So, your dbs are actually slightly larger than your document base? Well, if htdig didn't fail, I suppose mine might be slightly larger too, although it should still have enough space.

This depends significantly on the max_head_length you use (i.e. the size of the excerpts you store). When I get pinched for disk space, I cut this down.

> Am I right in assuming that running "htdig -i -v -s" isn't creating a temporary set of databases and then writing them to the db directory? Because if it did, I'd need over 500MB free on the hard disk, and I wouldn't have that much space free.

I don't believe htdig does this. On the other hand, htmerge uses temporary sets plus sort files. :-(

>> It's true, with a badwords list where I put in all meaningless words I was able to spot using contrib/wordfreq/. That almost halved database size.
>
> I'll have to take a look at that, thanks.

I can't attest to halving, but it does help. I didn't use wordfreq, but I used "cut -f 1 db.wordlist | uniq -c | sort -r" to determine how many documents each word was in, then I took the top 500 and edited the list.

-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/
htdig: DB2 problem
Well, on top of my other problems, I'm now getting a message:

    DB2 problem...: /usr/local/db/db.words.db: write failed for page 61819
    DB2 problem...: unable to create/retrieve page 618

from running rundig. I've tried a couple of times now to dig, and gotten the same error message, even though I've completely replaced the db files in /db.

Any suggestions appreciated,

Jeff Hill
--
* HR On-Line: The Network for Workplace Issues ** Ph:416-604-7251 -- Fax:416-604-4708 ** http://www.hronline.com **
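A "write failed" message would be consistent with a full filesystem, although nothing in the message itself confirms that. As a first check, free space can be compared against the current size of the database directory. The sketch below uses only df and du; the function names and the default safety factor of 2 are assumptions made for illustration, not documented htdig requirements.

```shell
# Print free space (in KB) on the filesystem holding DIR.
free_kb() {
    # -P forces the one-line-per-filesystem POSIX layout; field 4 is "Available".
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Print the space (in KB) currently used by DIR.
used_kb() {
    du -sk "$1" | cut -f 1
}

# Succeed only if free space is at least FACTOR times the current usage.
# FACTOR defaults to 2; that margin is a guess, not a documented figure.
has_headroom() {
    dir=$1
    factor=${2:-2}
    [ "$(free_kb "$dir")" -ge $(( $(used_kb "$dir") * factor )) ]
}
```

For example, "has_headroom /usr/local/db || echo 'low on space'" before running rundig, using the directory from the error message above.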
Re: htdig: DB2 problem on AIX 4.1.5 - unexpected file format
Hi Alexander et al,

This DBM information I could not find in this release. What I did find was that by default, it was using -O3 in its DB compilations on AIX. I changed that back to -O2 and my last htmerge was successful. At this time, having changed nothing else except adding the --debug request of Mr. Htdig :), I am being led to believe that the -O3 to gcc was the problem.

I'll report back if I find it failing again.

Hope this helps some others too,
JES

On Thu, 5 Nov 1998, Alexander Bergolth wrote:

Hi! At 22:30 04.11.98 , James B. MacLean wrote:

With this new release (3.1..) on an AIX 4.1.5 box I am always getting :

    --
    htmerge: Total word count: 117886
    DB2 problem...: /usr/local/htdig/db/db.docdb: unexpected file format

I had the same problem on AIX 4.2.1. (Did you use xlc?) On 1998/09/14 I posted the fix for my box to the ht://Dig list. Hope that helps!

James B. MacLean [EMAIL PROTECTED]
Department of Education http://www.ednet.ns.ca/~macleajb
Nova Scotia, Canada B3M 4B2
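For anyone wanting to check their own build for the same flag, a search along these lines might help. It is a sketch under assumptions: the message above does not say which file sets -O3, so looking in files named Makefile* is a guess, and find_o3 is a name made up here.

```shell
# List Makefile-like files under DIR whose contents mention -O3.
# Assumption: the flag appears literally in a file named Makefile*.
# Usage: find_o3 [DIR]   (DIR defaults to the current directory)
find_o3() {
    # -e keeps grep from treating the pattern "-O3" as an option.
    find "${1:-.}" -type f -name 'Makefile*' -exec grep -l -e '-O3' {} \;
}
```

After changing the flag to -O2 in whatever files turn up, the library and htdig would need rebuilding.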
Re: htdig: DB2 problem on AIX 4.1.5 - unexpected file format
Hi! At 22:30 04.11.98 , James B. MacLean wrote:

With this new release (3.1..) on an AIX 4.1.5 box I am always getting :

    --
    htmerge: Total word count: 117886
    DB2 problem...: /usr/local/htdig/db/db.docdb: unexpected file format

I had the same problem on AIX 4.2.1. (Did you use xlc?) On 1998/09/14 I posted the fix for my box to the ht://Dig list. Hope that helps!

- snipp! -

I had the same problem, but I think it's a bug in the Berkeley DB library. (Maybe only on some AIX machines?) In my case the problem occurs when opening some existing database files with db_open having set the DB_CREATE flag. Without DB_CREATE the database can be opened.

I applied the following dirty hack on DB2_db::OpenReadWrite (in DB2_db.cc):

    // Create the database.
    //
    // LEO strange error opening existing database files
    //if ((errno = db_open(filename, DB_BTREE, DB_CREATE, mode, dbenv,
    //                     dbinfo, dbp)) == 0)
    if (access(filename, F_OK) == 0)
        errno = db_open(filename, DB_BTREE, 0, 0, dbenv, dbinfo, dbp);
    else
        errno = db_open(filename, DB_BTREE, DB_CREATE, mode, dbenv, dbinfo, dbp);
    if (errno == 0)
    // /LEO

- snipp! -

---
Alexander (Leo) Bergolth [EMAIL PROTECTED]
WU-Wien - Zentrum fuer Informatikdienste http://leo.wu-wien.ac.at Info Center
In a world without walls and fences, who needs windows and gates?
htdig: DB2 problem on AIX 4.1.5 - unexpected file format
Hi everyone,

With this new release (3.1..) on an AIX 4.1.5 box I am always getting :

    --
    htmerge: Total word count: 117886
    DB2 problem...: /usr/local/htdig/db/db.docdb: unexpected file format
    htmerge: Total documents: 0
    htmerge: Total doc db size (in K): 0
    --

I do not believe it to be a space problem. Any suggestions as to where I should look?

thanks,
JES
--
James B. MacLean [EMAIL PROTECTED]
Department of Education http://www.ednet.ns.ca/~macleajb
Nova Scotia, Canada B3M 4B2
Re: htdig: DB2 problem on AIX 4.1.5 - unexpected file format
At 4:30 PM -0500 11/4/98, James B. MacLean wrote:

I do not believe it to be a space problem. Any suggestions as to where I should look?

This may sound obvious, but you said "in this new release." Did you upgrade from 3.0.8b2? If so, did you forget to remove your old databases? If not, try removing the databases or running with '-a' to eliminate the possibility of database corruption.

There are other possibilities (the DB code didn't compile correctly), but these are more likely.

-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/
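Rather than deleting the old databases outright, they can be moved aside so they remain available for comparison. This is a minimal sketch: the function name set_aside_dbs and the backup location are hypothetical, and the only thing taken from the thread is the db.* naming of the files (db.docdb, db.docs.index, db.wordlist, db.words.db).

```shell
# Move every db.* file from DB_DIR into BACKUP_DIR, leaving other files alone.
# Usage: set_aside_dbs DB_DIR BACKUP_DIR
set_aside_dbs() {
    db_dir=$1
    backup_dir=$2
    mkdir -p "$backup_dir" || return 1
    for f in "$db_dir"/db.*; do
        # The glob stays unexpanded when nothing matches, so check first.
        if [ -e "$f" ]; then
            mv "$f" "$backup_dir"/ || return 1
        fi
    done
}
```

Keeping the backup directory on the same filesystem makes the move a rename rather than a copy, which matters when the files are tens of megabytes each.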
Re: htdig: DB2 problem on AIX 4.1.5 - unexpected file format
On Wed, 4 Nov 1998, Geoff Hutchison wrote:

At 4:30 PM -0500 11/4/98, James B. MacLean wrote:

I do not believe it to be a space problem. Any suggestions as to where I should look?

This may sound obvious, but you said "in this new release." Did you upgrade from 3.0.8b2? If so, did you forget to remove your old databases?

Thanks for the quick response. Yes, this was an upgrade, but I removed the previous install (actually moved it to a far far away place :). After it blew up the first time, I checked the space again, deleted it all, and re-ran. Still the same error, same place :(.

If not, try removing the databases or running with '-a' to eliminate the possibility of database corruption.

I will run it again with a -a on both htdig and htmerge to check this suggestion out.

There are other possibilities (the DB code didn't compile correctly), but these are more likely.

It was the cleanest by far of the htdig installs I've done here on an AIX box. The unsigned int -> unsigned long change in htlib/Connection.cc, which was noted in the changelog as fixed, was the only change I made...

     71600128 Nov 4 11:50 db.docdb
         2048 Nov 4 12:06 db.docs.index
    129562285 Nov 4 12:06 db.wordlist
     88920064 Nov 4 12:06 db.words.db

In case it helps :)...

-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/

many thanks,
JES
--
James B. MacLean [EMAIL PROTECTED]
Department of Education http://www.ednet.ns.ca/~macleajb
Nova Scotia, Canada B3M 4B2
Re: htdig: DB2 problem on AIX 4.1.5 - unexpected file format
At 4:46 PM -0800 11/4/98, Geoff Hutchison wrote:

This may sound obvious, but you said "in this new release." Did you upgrade from 3.0.8b2? If so, did you forget to remove your old databases?

Upgrade. I deleted the old databases. Completely fresh install. It wasn't the old databases causing the problem on my site.

--
Chuq Von Rospach (Hockey fan? http://www.plaidworks.com/hockey/)
Apple Mail List Gnome (mailto:[EMAIL PROTECTED])
Plaidworks Consulting (mailto:[EMAIL PROTECTED])
http://www.plaidworks.com/ + http://www.lists.apple.com/
Re: htdig: DB2 problem on AIX 4.1.5 - unexpected file format
At 8:20 PM -0500 11/4/98, Chuq Von Rospach wrote:

At 4:46 PM -0800 11/4/98, Geoff Hutchison wrote:

This may sound obvious, but you said "in this new release." Did you upgrade from 3.0.8b2? If so, did you forget to remove your old databases?

Upgrade. I deleted the old databases. Completely fresh install. It wasn't the old databases causing the problem on my site.

OK. Chuq, you said you ran out of time trying to debug it. Ah well...

If someone would like to get to the bottom of this problem on AIX, try the following:

    ethel ~/htdig3> cd db-2.4.14/dist
    ethel ~/dist> ./configure --enable-diagnostic
    [omitted]
    ethel ~/dist> make clean; make
    [omitted]
    ethel ~/dist> cd ../..
    ethel ~/htdig3> make
    [omitted]

This will turn on diagnostic code in the database library. It will slow things considerably, but it should help figure out what's going on. Run htdig as usual and see what stuff comes up from the DB code.

-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/