://www.hathitrust.org/large_scale_search and our blog:
http://www.hathitrust.org/blogs/large-scale-search (I'll be updating the
blog with details of current hardware and performance tests in the next week
or so)
Tom
Tom Burton-West
Digital Library
You might try a couple of tests in the Solr admin interface to make sure the
query is being processed the same way in both Solr and raw Lucene.
1) use the analysis panel to determine whether the Solr filter chain is doing
something unexpected compared to your Lucene filter chain
2) try running a debug
Hello,
I think you are confusing the size of the data you want to index with the
size of the index. For our indexes (large full text documents) the Solr
index is about 1/3 of the size of the documents being indexed. For 3 TB of
data you might have an index of 1 TB or less. This depends on
Hi Norberto,
After working a bit on trying to port the Nutch CommonGrams code, I ran into
lots of dependencies on Nutch and Hadoop. Would it be possible to get more
information on how you use shingles (or code)? Are you creating shingles for
all two-word combinations or using a list of words?
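
To make the two options in that question concrete, here is a minimal Python
sketch. It is only an illustration, not a port of the Nutch CommonGrams code or
of any Solr filter; the function names, the "_" separator, and the sample word
list are all assumptions made up for the example.

```python
def all_shingles(tokens):
    """Option 1: every adjacent two-word pair becomes a shingle."""
    return [f"{a}_{b}" for a, b in zip(tokens, tokens[1:])]


def common_word_shingles(tokens, common_words):
    """Option 2: only pairs containing a word from the list become shingles."""
    return [
        f"{a}_{b}"
        for a, b in zip(tokens, tokens[1:])
        if a in common_words or b in common_words
    ]


if __name__ == "__main__":
    tokens = "the quick brown fox".split()
    # Hypothetical common-word list; a real one would come from index statistics.
    common_words = {"the"}
    print(all_shingles(tokens))
    print(common_word_shingles(tokens, common_words))
```

The first option adds one extra term for every adjacent pair of words, while
the second only adds terms for pairs that involve a listed word, so the choice
affects how much the index grows.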