Re: Num docs, block join, and dupes?
We've seen this as well. Before we understood the cause, it seemed very bizarre that hitting different nodes would yield different numFound values, as would using different rows=N (since the proxying node only de-dupes the documents that are actually returned in the response). I think consistency and correctness should be clearly delineated. Of course we'd rather have consistently correct results, but failing that, I'd rather have consistently incorrect results than inconsistent ones, because inconsistent results are even harder to debug, as was the case here. I think either the node hosting the shard should also do the de-duping, or no one should. It's strange that the proxying node decides to do a sketchy de-dupe of just the limited result set.

On Tue, Mar 10, 2015 at 9:09 AM, Timothy Potter thelabd...@gmail.com wrote:

Before I open a JIRA, I wanted to put this out to solicit feedback on what I'm seeing and what Solr should be doing. I've indexed the following 8 docs into a 2-shard collection (Solr 4.8-ish -- an internal custom branch roughly based on 4.8) ... notice that the 3 grandchildren of 2-1 have dup'd keys:

[
  { "id": "1", "name": "parent", "_childDocuments_": [
      { "id": "1-1", "name": "child" },
      { "id": "1-2", "name": "child" }
  ]},
  { "id": "2", "name": "parent", "_childDocuments_": [
      { "id": "2-1", "name": "child", "_childDocuments_": [
          { "id": "2-1-1", "name": "grandchild" },
          { "id": "2-1-1", "name": "grandchild2" },
          { "id": "2-1-1", "name": "grandchild3" }
      ]}
  ]}
]

When I query this collection using:

http://localhost:8984/solr/blockjoin2_shard2_replica1/select?q=*%3A*&wt=json&indent=true&shards.info=true&rows=10

I get:

{
  "responseHeader": {
    "status": 0,
    "QTime": 9,
    "params": {
      "indent": "true",
      "q": "*:*",
      "shards.info": "true",
      "wt": "json",
      "rows": "10"}},
  "shards.info": {
    "http://localhost:8984/solr/blockjoin2_shard1_replica1/|http://localhost:8985/solr/blockjoin2_shard1_replica2/": {
      "numFound": 3,
      "maxScore": 1.0,
      "shardAddress": "http://localhost:8984/solr/blockjoin2_shard1_replica1",
      "time": 4},
    "http://localhost:8984/solr/blockjoin2_shard2_replica1/|http://localhost:8985/solr/blockjoin2_shard2_replica2/": {
      "numFound": 5,
      "maxScore": 1.0,
      "shardAddress": "http://localhost:8985/solr/blockjoin2_shard2_replica2",
      "time": 4}},
  "response": {"numFound": 6, "start": 0, "maxScore": 1.0, "docs": [
      {"id": "1-1", "name": "child"},
      {"id": "1-2", "name": "child"},
      {"id": "1", "name": "parent", "_version_": 1495272401329455104},
      {"id": "2-1-1", "name": "grandchild"},
      {"id": "2-1", "name": "child"},
      {"id": "2", "name": "parent", "_version_": 1495272401361960960}]
  }}

So Solr has de-duped the results. If I execute the same query against the shard that has the dupes (distrib=false):

http://localhost:8984/solr/blockjoin2_shard2_replica1/select?q=*%3A*&wt=json&indent=true&shards.info=true&rows=10&distrib=false

then the dupes are returned:

{
  "responseHeader": {
    "status": 0,
    "QTime": 0,
    "params": {
      "indent": "true",
      "q": "*:*",
      "shards.info": "true",
      "distrib": "false",
      "wt": "json",
      "rows": "10"}},
  "response": {"numFound": 5, "start": 0, "docs": [
      {"id": "2-1-1", "name": "grandchild"},
      {"id": "2-1-1", "name": "grandchild2"},
      {"id": "2-1-1", "name": "grandchild3"},
      {"id": "2-1", "name": "child"},
      {"id": "2", "name": "parent", "_version_": 1495272401361960960}]
  }}

So I guess my question is: why doesn't the non-distributed query do de-duping? Mainly I'm confirming this is how it's supposed to work and that this behavior doesn't strike anyone else as odd ;-)

Cheers,
Tim
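A quick way to reproduce the discrepancy Tim describes is to run the same query twice, once distributed and once with distrib=false, and compare numFound. Here is a minimal SolrJ sketch of that check; it uses the HttpSolrClient builder API from a later SolrJ release rather than the 4.8-era HttpSolrServer, and the URL is taken from Tim's example:

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;

public class BlockJoinDupeCheck {
    public static void main(String[] args) throws Exception {
        try (HttpSolrClient client = new HttpSolrClient.Builder(
                "http://localhost:8984/solr/blockjoin2_shard2_replica1").build()) {

            // Distributed query: the proxying node de-dupes only the rows it returns.
            SolrQuery q = new SolrQuery("*:*").setRows(10);
            long distributed = client.query(q).getResults().getNumFound();

            // Same query against the core directly: no de-duping happens at all.
            q.set("distrib", "false");
            long local = client.query(q).getResults().getNumFound();

            // With duplicate grandchild ids on this shard, the two numbers differ.
            System.out.println("distributed=" + distributed + " local=" + local);
        }
    }
}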
Re: Num docs, block join, and dupes?
On Tue, Mar 10, 2015 at 7:09 PM, Timothy Potter thelabd...@gmail.com wrote:

So I guess my question is: why doesn't the non-distributed query do de-duping?

Tim, that behavior is by design. The special _root_ field is used as the delete term when a block update is applied, i.e. in the case of a block, uniqueKey is not used. See https://github.com/apache/lucene-solr/blob/trunk/solr/core/src/java/org/apache/solr/update/DirectUpdateHandler2.java#L224

I agree that this is one of the issues with the current block-update implementation, but frankly speaking, I didn't consider it an oddity. Do you? What do you want to achieve?

--
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics
http://www.griddynamics.com
mkhlud...@griddynamics.com
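In other words, within a block the children's ids are never consulted for overwrite checks: re-adding the parent deletes the old block by its _root_ term and writes the new one wholesale. A minimal SolrJ sketch of such a block update (the client parameter and field values are illustrative, not from the thread):

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.common.SolrInputDocument;

public class BlockUpdateSketch {
    static void indexBlock(SolrClient client) throws Exception {
        SolrInputDocument parent = new SolrInputDocument();
        parent.addField("id", "2");
        parent.addField("name", "parent");

        SolrInputDocument child1 = new SolrInputDocument();
        child1.addField("id", "2-1-1");   // duplicate child id...
        SolrInputDocument child2 = new SolrInputDocument();
        child2.addField("id", "2-1-1");   // ...is indexed without complaint

        parent.addChildDocument(child1);
        parent.addChildDocument(child2);

        // On add, Solr deletes any previous block by its _root_ term ("2")
        // and writes the whole new block; child uniqueKeys are never checked.
        client.add(parent);
        client.commit();
    }
}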
Re: Num docs
Hmmm, distributed BDB... brrr :)

On Fri, Jun 13, 2008 at 3:21 AM, Otis Gospodnetic [EMAIL PROTECTED] wrote:

Or, if you want to go with something older/more stable, go with BDB. :)

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message ----
From: Marcus Herou [EMAIL PROTECTED]
To: solr-user@lucene.apache.org
Sent: Thursday, June 12, 2008 3:17:52 PM
Subject: Re: Num docs

Cacti, Nagios, you name it -- already in use :) Well, I'm the CTO, so I'm the one really, really interested in estimating performance. The ids come from a DB initially and are later used for retrieval from a distributed on-disk caching system which I have written. I'm in the process of moving from MySQL to HBase or Hypertable.

/M

On Tue, Jun 10, 2008 at 10:03 PM, Otis Gospodnetic [EMAIL PROTECTED] wrote:

Marcus,

It sounds like you may just want to use a good server-monitoring package that collects server data and prints out pretty charts. Then you can show them to your IT/budget people when the charts start showing increased query latency, very little available RAM, swapping, high CPU usage and such. Nagios, Ganglia -- any of those things will do.

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
Re: Num docs
Well guys, you are right... Still, I want to have a clue about how much each machine stores so I can predict when we need more machines (measuring performance degradation per new document). But that kind of data is harder to collect. It sure is doable, no doubt, and is a normal sharding algo for MySQL.

The best approach, I think, is to have some bg threads run X number of queries and collect the response times, throw away the n lowest/highest response times, and calc an avg time which is then used in sharding and query lb'ing.

A little off topic, but interesting: what would you guys say is a good correlation between the index size on disk (no stored text content) and available RAM for good response times? "How long is a rope?" you would perhaps say... but I think some rule of thumb could be established.

One of the schemas of concern:

<fields>
  <field name="feedId" type="integer" indexed="true" stored="false" required="true" />
  <field name="feedItemId" type="long" indexed="true" stored="true" required="true" />
  <field name="siteId" type="integer" indexed="true" stored="true" required="false" />
  <field name="partnerType" type="integer" indexed="true" stored="false" required="true" />
  <field name="uid" type="string" indexed="true" stored="false" required="true" />
  <field name="link" type="string" indexed="true" stored="false" required="true" />
  <field name="description" type="text" indexed="true" stored="false" required="false" />
  <field name="title" type="text" indexed="true" stored="false" required="true" />
  <field name="publishDate" type="date" indexed="true" stored="false" required="true" />
  <field name="author" type="string" indexed="true" stored="false" required="false" />
  <field name="keyWordId" type="integer" indexed="true" stored="false" required="false" multiValued="true" />
  <field name="category" type="integer" indexed="true" stored="false" required="false" />
  <field name="language" type="integer" indexed="true" stored="false" required="false" />
  <field name="country" type="integer" indexed="true" stored="false" required="false" />
  <field name="ngramLang" type="integer" indexed="true" stored="false" required="false" />
</fields>

And a normal Solr query (taken from the log):

/select?start=0&q=(title:(apple)^4+OR+description:(apple))&version=2.2&rows=15&wt=xml&sort=publishDate+desc

//Marcus

On Tue, Jun 10, 2008 at 1:15 AM, Otis Gospodnetic [EMAIL PROTECTED] wrote:

Exactly. I think I mentioned this once before, several months ago. One can take various hardware specs (# of cores, CPU speed, FSB, RAM, etc.), performance numbers, etc. and come up with a number for each server's overall capacity. As a matter of fact, I think this would be useful to have right in Solr, primarily for use when allocating and sizing shards for Distributed Search. JIRA enhancement/feature issue?

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

--
Marcus Herou
CTO and co-founder Tailsweep AB
+46702561312
[EMAIL PROTECTED]
http://www.tailsweep.com/
http://blogg.tailsweep.com/
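The background-probe approach Marcus sketches above is simple to prototype: run a fixed number of queries, drop the n fastest and n slowest timings, and average the rest. A minimal sketch under stated assumptions -- the query runner, run count, and trim count are illustrative placeholders:

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class LatencyProbe {
    // Runs the sampler `runs` times, drops the `trim` fastest and `trim`
    // slowest timings, and averages the rest -- the number Marcus suggests
    // feeding into shard selection and query load balancing.
    static double trimmedMeanMillis(Runnable querySampler, int runs, int trim) {
        List<Long> timings = new ArrayList<>(runs);
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            querySampler.run();                        // e.g. issue one representative Solr query
            timings.add((System.nanoTime() - start) / 1_000_000);
        }
        Collections.sort(timings);                     // outliers move to the ends
        long sum = 0;
        for (int i = trim; i < runs - trim; i++) {
            sum += timings.get(i);
        }
        return sum / (double) (runs - 2 * trim);       // assumes runs > 2 * trim
    }
}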
Re: Num docs
Marcus,

2008/6/10 Marcus Herou [EMAIL PROTECTED]:

Well guys, you are right... Still, I want to have a clue about how much each machine stores so I can predict when we need more machines (measuring performance degradation per new document). But that kind of data is harder to collect. It sure is doable, no doubt, and is a normal sharding algo for MySQL.

Sorry, but I think "performance degradation per new document" isn't a good metric, if not a false one. What you'd be measuring is the cost in processing, memory, and I/O read/write speed that Solr incurs, and I can't see a way to derive that information from document quantity alone. Just consider that the same index under different usage policies and overall architectures can show drastically different system performance.

The best approach, I think, is to have some bg threads run X number of queries and collect the response times, throw away the n lowest/highest response times, and calc an avg time which is then used in sharding and query lb'ing.

Sorry? I didn't get the point...

A little off topic, but interesting: what would you guys say is a good correlation between the index size on disk (no stored text content) and available RAM for good response times?

I would need to benchmark a little more to answer you.

"How long is a rope?" you would perhaps say... but I think some rule of thumb could be established.

We need to establish good metrics before we can establish good rules.

One of the schemas of concern: (schema snipped -- quoted in full in Marcus's message above)

Let me ask you something: where do you get all these ids from? A database? What about its access times?
Re: Num docs
Marcus,

It sounds like you may just want to use a good server-monitoring package that collects server data and prints out pretty charts. Then you can show them to your IT/budget people when the charts start showing increased query latency, very little available RAM, swapping, high CPU usage and such. Nagios, Ganglia -- any of those things will do.

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message ----
From: Marcus Herou [EMAIL PROTECTED]
To: solr-user@lucene.apache.org
Sent: Tuesday, June 10, 2008 3:29:40 PM
Subject: Re: Num docs

Well guys, you are right... Still, I want to have a clue about how much each machine stores so I can predict when we need more machines (measuring performance degradation per new document). But that kind of data is harder to collect.
Re: Num docs
Exactly. I think I mentioned this once before, several months ago. One can take various hardware specs (# of cores, CPU speed, FSB, RAM, etc.), performance numbers, etc. and come up with a number for each server's overall capacity. As a matter of fact, I think this would be useful to have right in Solr, primarily for use when allocating and sizing shards for Distributed Search. JIRA enhancement/feature issue?

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message ----
From: Alexander Ramos Jardim [EMAIL PROTECTED]
To: solr-user@lucene.apache.org
Sent: Monday, June 9, 2008 6:42:17 PM
Subject: Re: Num docs

I even think that such a decision should be based on the overall machine performance at a given time, and not on the index size -- unless you are talking solely about HD space and not having any performance issues.

2008/6/7 Otis Gospodnetic:

Marcus, for that you can rely on du, vmstat, iostat, top and such, too. :)

--
Alexander Ramos Jardim
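For what it's worth, the "capacity number" Otis floats could be as simple as a normalized blend of specs and measured throughput. A purely illustrative sketch -- the baseline figures and weights are assumptions, not anything from this thread:

public class ServerCapacity {
    // Collapse hardware specs plus one measured performance number into a
    // single comparable score a shard allocator could sort servers by.
    // The baseline box (8 cores @ 2.5 GHz, 32 GB RAM, 100 QPS) and the
    // weights are entirely illustrative.
    static double capacityScore(int cores, double cpuGhz, double ramGb, double measuredQps) {
        double cpu = (cores * cpuGhz) / (8 * 2.5);
        double ram = ramGb / 32.0;
        double throughput = measuredQps / 100.0;
        // Weight observed throughput twice as heavily as raw specs.
        return (cpu + ram + 2 * throughput) / 4.0;
    }
}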
Re: Num docs
Marcus, check out the Luke request handler. You can get that number from its output. It may also be possible to get *just* that number, but I'm not looking at the docs/code right now to know for sure.

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message ----
From: Marcus Herou [EMAIL PROTECTED]
To: solr-user@lucene.apache.org
Sent: Saturday, June 7, 2008 5:09:20 AM
Subject: Num docs

Hi. Is there a way to retrieve IndexWriter.numDocs() in Solr?

Kindly
//Marcus

--
Marcus Herou
CTO and co-founder Tailsweep AB
+46702561312
[EMAIL PROTECTED]
http://www.tailsweep.com/
http://blogg.tailsweep.com/
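A sketch of pulling that number programmatically via SolrJ's LukeRequest; note this uses a client API from a much later SolrJ release than the Solr of this thread, and the core URL is illustrative:

import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.request.LukeRequest;
import org.apache.solr.client.solrj.response.LukeResponse;

public class NumDocsViaLuke {
    public static void main(String[] args) throws Exception {
        try (HttpSolrClient client = new HttpSolrClient.Builder(
                "http://localhost:8983/solr/mycore").build()) {
            LukeRequest luke = new LukeRequest();
            luke.setNumTerms(0);  // skip per-field term stats; we only want index-level numbers
            LukeResponse rsp = luke.process(client);
            // The response's "index" section also carries maxDoc, segment
            // counts, and similar index-level statistics.
            System.out.println("numDocs (live docs) = " + rsp.getNumDocs());
        }
    }
}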
Re: Num docs
Thanks. I wanna ask the indices how much more each shard can handle before they're considered full, and then scream for budget to get a new machine :)

/M

On Sat, Jun 7, 2008 at 3:07 PM, Otis Gospodnetic [EMAIL PROTECTED] wrote:

Marcus, check out the Luke request handler. You can get that number from its output. It may also be possible to get *just* that number, but I'm not looking at the docs/code right now to know for sure.

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

--
Marcus Herou
CTO and co-founder Tailsweep AB
+46702561312
[EMAIL PROTECTED]
http://www.tailsweep.com/
http://blogg.tailsweep.com/
Re: Num docs
Marcus,

For that you can rely on du, vmstat, iostat, top and such, too. :)

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message ----
From: Marcus Herou [EMAIL PROTECTED]
To: solr-user@lucene.apache.org
Sent: Saturday, June 7, 2008 12:33:10 PM
Subject: Re: Num docs

Thanks. I wanna ask the indices how much more each shard can handle before they're considered full, and then scream for budget to get a new machine :)

/M
RE: Num docs
This appears on the stats.jsp page: both the total number of document "slots" (maxDoc, which includes deleted documents) and the number of live documents (numDocs).

-----Original Message-----
From: Marcus Herou [mailto:[EMAIL PROTECTED]
Sent: Saturday, June 07, 2008 2:09 AM
To: solr-user@lucene.apache.org
Subject: Num docs

Hi. Is there a way to retrieve IndexWriter.numDocs() in Solr?

Kindly
//Marcus

--
Marcus Herou
CTO and co-founder Tailsweep AB
+46702561312
[EMAIL PROTECTED]
http://www.tailsweep.com/
http://blogg.tailsweep.com/
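The slots-vs-live distinction matters when estimating fullness, because deleted documents still occupy slots until segments are merged away. A small sketch reading both numbers straight from a Lucene index -- this is the modern Lucene API rather than the 2008-era one, and the index path is illustrative:

import java.nio.file.Paths;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.store.FSDirectory;

public class DocCounts {
    public static void main(String[] args) throws Exception {
        // numDocs() = live documents; maxDoc() = total slots, including
        // documents that were deleted but not yet merged away.
        try (DirectoryReader reader = DirectoryReader.open(
                FSDirectory.open(Paths.get("/var/solr/data/index")))) {
            System.out.println("live docs: " + reader.numDocs());
            System.out.println("doc slots: " + reader.maxDoc());
            System.out.println("deleted  : " + reader.numDeletedDocs());
        }
    }
}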