The norms are modded so each norm value is stored as 4 byte instead of 1 byte, this modification is using more memory. But anyway the hw we are running on are 2x 8 cpu hp servers with 16 gig ram in each of them.
We are scaling the index on daterange (and the ranking is modified to sort by date) Ex: 2007-01 - 2007-06 > ix 1 2007-07 - 2007-12 > ix 2 Each index is hosted in a separate hosting application and we have a layer infront of all indexes to merge the results. So it is our own "search engine" we tried solr but I didn't really liked it + we had to do some modifications to support the business. Since we are running 32 bit OS (we are going to use 64 bit soon it will be interesting) on windows with pae each process can consume just 2-2.5gb memory so we are having a lot of indexes on the same machine. We did this some years ago and we run it 24/7 without any failure, indexing approx 200-300k new articles every day. Any way... We really need to find a good api / some one that knows how to add inverted searching to lucene. /Regards Marcus -----Ursprungligt meddelande----- Från: Mark Miller [mailto:[EMAIL PROTECTED] Skickat: den 16 januari 2008 19:14 Till: java-user@lucene.apache.org Ämne: Re: Inverted search / Search on profilenet Don't have any info to add, but out of curiosity, what kind of setup are you using to host the 300 mil archive? Is the index distributed? Single machine? Solr? Thanks, Mark On Jan 16, 2008 12:27 PM, Marcus Falk <[EMAIL PROTECTED]> wrote: > Hi again, > > > > Today we are hosting a 300 million large search index without any > problems in a lucene environment, with just some customization in the > lucene api for ranking etc... > > > > So we are really satisfied with lucene. > > > > We also have the demands to search with documents on profiles we are > currently using verity (autonomy) for this, where we store the profiles > in the index and are using the document as query. > > The verity api we are using seems to have some internal threading > problems (race conditions) so we need to find another way to perform > those kind of searches. > > > > Does anybody have any ideas of any api that could do this for us? Any > ideas on how lucene could be modified to do this kind of searches? > > > > The volumes are around 300k full length articles distributed some what > evenly over a 24h period on a 50 k profilenet. > > > > > > /Mvh > > Marcus > > > > > > > > --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]