According to Robert Sink:
> Since the overhead of each web server indexing each of the remaining
> sites is so high, we decided that one of the machines would index
> all the others and would then distribute the databases to the other
> machines.
>
> Between the SPARC Solaris machines this works beautifully. However,
> when trying to put the same databases on v3.1.5 Linux x86 (Debian in
> this case), we end up with very odd results and database errors (all
> gcc 2.9.x compiled and linked internally with the 2.6.4 Sleepycat DB).
>
> I realize that v3.2.0b1-3 offers htdb_dump/load; however, we cannot
> get 3.2.0b1-3 to operate correctly on any of the machines (SPARC or
> Linux), so that is not an option at this juncture.
>
> I've tried simply using the db_dump/db_load that come with the
> respective db package inside the htdig 3.1.5 source tree, and while
> I am able to dump and load successfully, htmerge gives the same
> errors as if the database had simply been copied from one of the
> SPARC machines without dumping the output to ASCII.
db_dump and db_load (as well as htdb_dump/htdb_load in 3.2) just dump
and load raw DB records, without doing any decoding or encoding of the
data these records contain. This is very different from htdump and
htload, also in 3.2, which use ASCII representations for all the data
ht://Dig stores in its DB records. The db_dump output is not portable
because the records still contain binary data, whose byte order
depends on the machine that wrote them.
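To illustrate the problem, here's a minimal standalone sketch (plain
C, not ht://Dig code): a multi-byte integer written out as raw bytes
on big-endian SPARC comes back byte-reversed on little-endian x86,
while an ASCII rendering of the same value is identical on both.

    /* endian_demo.c -- standalone illustration, not ht://Dig code.
     * Shows why raw binary record fields are not portable between
     * big-endian SPARC and little-endian x86, while ASCII is. */
    #include <stdio.h>
    #include <string.h>

    int main()
    {
        unsigned int docid = 0x00000a2c;    /* e.g. a DocID of 2604 */
        unsigned char raw[sizeof docid];
        unsigned int i;

        /* "Binary" form: the in-memory bytes, i.e. what a raw DB
         * record holds.  Their order depends on the host CPU. */
        memcpy(raw, &docid, sizeof docid);
        printf("raw bytes:");
        for (i = 0; i < sizeof docid; i++)
            printf(" %02x", (unsigned)raw[i]);
        /* SPARC prints: 00 00 0a 2c    x86 prints: 2c 0a 00 00 */
        printf("\n");

        /* "ASCII" form: the same value as text -- identical on both
         * architectures, so it survives the copy. */
        printf("ascii form: %u\n", docid);
        return 0;
    }

The raw bytes come out in opposite orders on the two machines; the
ASCII form is the same everywhere, which is the whole trick behind
htdump/htload.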
> I was wondering if you could offer any guidance on how to
> successfully export these databases between SPARC and Linux
> (big/little endian).
>
> Alternatively, we are aware we could use one machine as the search
> host and simply reference htsearch via URL from the different
> machines; however, due to our high level of autonomy and differing
> quality of Internet service this is not an option, and we must have
> at least one live database at each site.
Well, you could run two digs, one on Linux and one on Solaris, then
copy the little-endian data to all the other Linux systems, and the
big-endian data to all the other Solaris systems.
The only other option I can suggest would be to write a simple tool
that dumps all the db.docdb records in a portable (ASCII) format,
and another that makes a new db.docdb from this format (i.e. your own
htdump and htload for 3.1.5). This may not be as nasty as it sounds.
You could use htnotify as the skeleton for your htdump, as it does a
traversal of db.docdb, and just write out every field you get from the
DocumentRef record. The corresponding htload utility might be a
little more work, but still manageable. It would need to rebuild a
DocumentRef record for the reconstructed db.docdb file, as well as a
record for db.docs.index to map DocIDs to URLs. You really only need
to copy over an ASCII form of db.docdb, as the db.wordlist file is
already ASCII data. Then, on the target system, run htmerge on the
reconstructed database to get the new db.words.db file.
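If you do write such a pair, the piece that actually buys you
portability is the ASCII record format itself. Below is a minimal
standalone sketch of one possible format and its reader; the
docid/url/title layout is my own invention for illustration, not
ht://Dig's htdump format, and in the real tool the values would come
from the DocumentRef fields your htnotify-style traversal hands you.

    /* docdump_format.c -- sketch of a portable ASCII record format
     * for a home-grown htdump/htload pair.  The field layout is an
     * invention for illustration, not ht://Dig's htdump format; a
     * real tool would fill these values from DocumentRef. */
    #include <stdio.h>
    #include <string.h>

    /* Write one document record as "name: value" lines, ended by a
     * blank line.  Pure ASCII, so byte order no longer matters. */
    void write_record(FILE *out, unsigned int docid,
                      const char *url, const char *title)
    {
        fprintf(out, "docid: %u\n", docid);
        fprintf(out, "url: %s\n", url);
        fprintf(out, "title: %s\n", title);
        fprintf(out, "\n");
    }

    /* Read one record back; returns 1 if a record was found, 0 at
     * EOF.  The loader would use these values to rebuild the
     * DocumentRef and the matching db.docs.index entry. */
    int read_record(FILE *in, unsigned int *docid,
                    char *url, size_t ulen, char *title, size_t tlen)
    {
        char line[2048];
        int seen = 0;

        *docid = 0;
        url[0] = title[0] = '\0';
        while (fgets(line, sizeof line, in)) {
            line[strcspn(line, "\n")] = '\0';   /* strip newline */
            if (line[0] == '\0')       /* blank line ends a record */
                return seen;
            if (sscanf(line, "docid: %u", docid) == 1)
                seen = 1;
            else if (strncmp(line, "url: ", 5) == 0) {
                strncpy(url, line + 5, ulen - 1);
                url[ulen - 1] = '\0';
                seen = 1;
            } else if (strncmp(line, "title: ", 7) == 0) {
                strncpy(title, line + 7, tlen - 1);
                title[tlen - 1] = '\0';
                seen = 1;
            }
        }
        return seen;
    }

One caveat for a real version: fields like the document head or
excerpt can contain newlines, so they would need some escaping
(URL-style %xx encoding, say) before being written out in a
line-oriented format like this.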
--
Gilles R. Detillieux E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre WWW: http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba Phone: (204)789-3766
Winnipeg, MB R3E 3J7 (Canada) Fax: (204)789-3930