commit is taking 1300 ms

2016-08-08 Thread Midas A
Hi,

Commit is taking more than 1300 ms. What should I check on the server?

Below is my configuration:

<autoCommit>
  <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>

<autoSoftCommit>
  <maxTime>${solr.autoSoftCommit.maxTime:-1}</maxTime>
</autoSoftCommit>
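If the 1300 ms is being spent on explicit hard commits issued by the indexing client, one common approach is to stop committing from the client, let the autoCommit above handle durability, and use commitWithin for visibility. A minimal SolrJ sketch of that pattern; the core URL, field values, and the 10 second commitWithin are hypothetical:

// Minimal sketch: index without explicit client-side commits; autoCommit handles
// durability, commitWithin handles visibility, so no update call blocks on a hard commit.
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.request.UpdateRequest;
import org.apache.solr.common.SolrInputDocument;

public class CommitWithinSketch {
    public static void main(String[] args) throws Exception {
        HttpSolrClient client =
                new HttpSolrClient.Builder("http://localhost:8983/solr/collection1").build();

        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", "example-1");

        UpdateRequest req = new UpdateRequest();
        req.add(doc);
        req.setCommitWithin(10000);   // ask Solr to make the doc visible within ~10s
        req.process(client);          // no explicit client.commit() call

        client.close();
    }
}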


Re: NoNode error on -downconfig when node does exist?

2016-08-08 Thread John Bickerstaff
LOL!  Thanks.

Oh yeah.  I've done my time in a support role!  Nothing more maddening than
a user who won't share the facts!

On Mon, Aug 8, 2016 at 9:24 PM, Erick Erickson 
wrote:

> BTW, kudos for including the commands in your first problem statement
> even though, I'm sure, you wondered if it was necessary. Saved at least
> three back-and-forths to get to the root of the problem (little pun
> there)...
>
> Erick
>
> On Mon, Aug 8, 2016 at 3:11 PM, John Bickerstaff
>  wrote:
> > OMG!
> >
> > Thanks.  Too long staring at the same string.
> >
> > On Mon, Aug 8, 2016 at 3:49 PM, Kevin Risden 
> > wrote:
> >
> >> Just a quick guess: do you have a period (.) in your zk connection
> string
> >> chroot when you meant an underscore (_)?
> >>
> >> When you do the ls you use /solr6_1/configs, but you have /solr6.1 in
> your
> >> zk connection string chroot.
> >>
> >> Kevin Risden
> >>
> >> On Mon, Aug 8, 2016 at 4:44 PM, John Bickerstaff <
> j...@johnbickerstaff.com
> >> >
> >> wrote:
> >>
> >> > First, the caveat:  I understand this is technically a zookeeper
> error.
> >> It
> >> > is an error that occurs when trying to deal with Solr however, so I'm
> >> > hoping someone on the list may have some insight.  Also, I'm getting
> the
> >> > error via the zkcli.sh tool that comes with Solr...
> >> >
> >> > I have created a collection in SolrCloud (6.1) giving the
> "techproducts"
> >> > sample directory as the location of the conf files.
> >> >
> >> > I then wanted to download those files from zookeeper to the local
> machine
> >> > via the -cmd downconfig command, so I issue this command:
> >> >
> >> > sudo /opt/solr/server/scripts/cloud-scripts/zkcli.sh -cmd downconfig
> >> > -confdir /home/john/conf/ -confname statdx -z 192.168.56.5/solr6.1
> >> >
> >> > Instead of the files, I get a stacktrace / error back which says :
> >> >
> >> > exception in thread "main" java.io.IOException: Error downloading
> files
> >> > from zookeeper path /configs/statdx to /home/john/conf
> >> > at
> >> > org.apache.solr.common.cloud.ZkConfigManager.downloadFromZK(
> >> > ZkConfigManager.java:117)
> >> > at
> >> > org.apache.solr.common.cloud.ZkConfigManager.downloadConfigDir(
> >> > ZkConfigManager.java:153)
> >> > at org.apache.solr.cloud.ZkCLI.main(ZkCLI.java:237)
> >> > *Caused by: org.apache.zookeeper.KeeperException$NoNodeException:
> >> > KeeperErrorCode = NoNode for /configs/statdx*
> >> > at org.apache.zookeeper.KeeperException.create(
> KeeperException.java:111)
> >> > at org.apache.zookeeper.KeeperException.create(
> KeeperException.java:51)
> >> > at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1472)
> >> > at
> >> > org.apache.solr.common.cloud.SolrZkClient$6.execute(
> >> SolrZkClient.java:331)
> >> > at
> >> > org.apache.solr.common.cloud.SolrZkClient$6.execute(
> >> SolrZkClient.java:328)
> >> > at
> >> > org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(
> >> > ZkCmdExecutor.java:60)
> >> > at
> >> > org.apache.solr.common.cloud.SolrZkClient.getChildren(
> >> > SolrZkClient.java:328)
> >> > at
> >> > org.apache.solr.common.cloud.ZkConfigManager.downloadFromZK(
> >> > ZkConfigManager.java:101)
> >> > ... 2 more
> >> >
> >> > However, when I actually look in Zookeeper, I find that the
> "directory"
> >> > does exist and that inside it are listed all the files.
> >> >
> >> > Here is the output from zookeeper:
> >> >
> >> > [zk: localhost:2181(CONNECTED) 0] *ls /solr6_1/configs*
> >> > [statdx]
> >> >
> >> > and...
> >> >
> >> > [zk: localhost:2181(CONNECTED) 1] *ls /solr6_1/configs/statdx*
> >> > [mapping-FoldToASCII.txt, currency.xml, managed-schema, protwords.txt,
> >> > synonyms.txt, stopwords.txt, _schema_analysis_synonyms_english.json,
> >> > velocity, admin-extra.html, update-script.js,
> >> > _schema_analysis_stopwords_english.json, solrconfig.xml,
> >> > admin-extra.menu-top.html, elevate.xml, clustering, xslt,
> >> > _rest_managed.json, mapping-ISOLatin1Accent.txt, spellings.txt, lang,
> >> > admin-extra.menu-bottom.html]
> >> >
> >> > I've rebooted all my zookeeper nodes and restarted them - just in
> case...
> >> > Same deal.
> >> >
> >> > Has anyone seen anything like this?
> >> >
> >>
>


Solr DeleteByQuery vs DeleteById

2016-08-08 Thread Bharath Kumar
Hi All,

We are using Solr 6.1 and I wanted to know which is better to use:
deleteById or deleteByQuery?

We have a program which deletes 10 documents from Solr every 5 minutes, and
we do it in batches of 200 to delete those documents. For that we currently
use deleteById(List ids, 1) to delete.
I wanted to know: if we change it to deleteByQuery(query, 1), where the
query is like this - (id:1 OR id:2 OR id:3 OR id:4), will this have a
performance impact?
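For reference, a minimal SolrJ sketch of the two forms being compared; the ZooKeeper address, collection name, ids, and the 1000 ms commitWithin value are hypothetical:

// Hedged sketch contrasting deleteById (ids sent directly) with deleteByQuery
// (a query that every replica must execute, which in SolrCloud can force
// reordering and searcher-reopening work on each node).
import java.util.Arrays;
import java.util.List;
import org.apache.solr.client.solrj.impl.CloudSolrClient;

public class DeleteSketch {
    public static void main(String[] args) throws Exception {
        CloudSolrClient client = new CloudSolrClient.Builder()   // SolrJ 6.x-style builder
                .withZkHost("localhost:2181")                    // hypothetical ZK ensemble
                .build();
        client.setDefaultCollection("collection1");              // hypothetical collection

        List<String> ids = Arrays.asList("1", "2", "3", "4");

        // Option 1: delete by id, commitWithin 1000 ms
        client.deleteById(ids, 1000);

        // Option 2: delete by a query built from the same ids
        client.deleteByQuery("id:(1 OR 2 OR 3 OR 4)", 1000);

        client.close();
    }
}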

We use SolrCloud with 3 Solr nodes in the cluster, and we also have a
similar setup on the target site, using Cross Data Center Replication
to replicate from the main site.

Can you please let me know if using deleteByQuery will have any impact? I
see it opens a real-time searcher on all the nodes in the cluster.

-- 
Thanks & Regards,
Bharath MV Kumar

"Life is short, enjoy every moment of it"


Re: NoNode error on -downconfig when node does exist?

2016-08-08 Thread Erick Erickson
BTW, kudos for including the commands in your first problem statement
even though, I'm sure, you wondered if it was necessary. Saved at least
three back-and-forths to get to the root of the problem (little pun there)...

Erick

On Mon, Aug 8, 2016 at 3:11 PM, John Bickerstaff
 wrote:
> OMG!
>
> Thanks.  Too long staring at the same string.
>
> On Mon, Aug 8, 2016 at 3:49 PM, Kevin Risden 
> wrote:
>
>> Just a quick guess: do you have a period (.) in your zk connection string
>> chroot when you meant an underscore (_)?
>>
>> When you do the ls you use /solr6_1/configs, but you have /solr6.1 in your
>> zk connection string chroot.
>>
>> Kevin Risden
>>
>> On Mon, Aug 8, 2016 at 4:44 PM, John Bickerstaff > >
>> wrote:
>>
>> > First, the caveat:  I understand this is technically a zookeeper error.
>> It
>> > is an error that occurs when trying to deal with Solr however, so I'm
>> > hoping someone on the list may have some insight.  Also, I'm getting the
>> > error via the zkcli.sh tool that comes with Solr...
>> >
>> > I have created a collection in SolrCloud (6.1) giving the "techproducts"
>> > sample directory as the location of the conf files.
>> >
>> > I then wanted to download those files from zookeeper to the local machine
>> > via the -cmd downconfig command, so I issue this command:
>> >
>> > sudo /opt/solr/server/scripts/cloud-scripts/zkcli.sh -cmd downconfig
>> > -confdir /home/john/conf/ -confname statdx -z 192.168.56.5/solr6.1
>> >
>> > Instead of the files, I get a stacktrace / error back which says :
>> >
>> > exception in thread "main" java.io.IOException: Error downloading files
>> > from zookeeper path /configs/statdx to /home/john/conf
>> > at
>> > org.apache.solr.common.cloud.ZkConfigManager.downloadFromZK(
>> > ZkConfigManager.java:117)
>> > at
>> > org.apache.solr.common.cloud.ZkConfigManager.downloadConfigDir(
>> > ZkConfigManager.java:153)
>> > at org.apache.solr.cloud.ZkCLI.main(ZkCLI.java:237)
>> > *Caused by: org.apache.zookeeper.KeeperException$NoNodeException:
>> > KeeperErrorCode = NoNode for /configs/statdx*
>> > at org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
>> > at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
>> > at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1472)
>> > at
>> > org.apache.solr.common.cloud.SolrZkClient$6.execute(
>> SolrZkClient.java:331)
>> > at
>> > org.apache.solr.common.cloud.SolrZkClient$6.execute(
>> SolrZkClient.java:328)
>> > at
>> > org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(
>> > ZkCmdExecutor.java:60)
>> > at
>> > org.apache.solr.common.cloud.SolrZkClient.getChildren(
>> > SolrZkClient.java:328)
>> > at
>> > org.apache.solr.common.cloud.ZkConfigManager.downloadFromZK(
>> > ZkConfigManager.java:101)
>> > ... 2 more
>> >
>> > However, when I actually look in Zookeeper, I find that the "directory"
>> > does exist and that inside it are listed all the files.
>> >
>> > Here is the output from zookeeper:
>> >
>> > [zk: localhost:2181(CONNECTED) 0] *ls /solr6_1/configs*
>> > [statdx]
>> >
>> > and...
>> >
>> > [zk: localhost:2181(CONNECTED) 1] *ls /solr6_1/configs/statdx*
>> > [mapping-FoldToASCII.txt, currency.xml, managed-schema, protwords.txt,
>> > synonyms.txt, stopwords.txt, _schema_analysis_synonyms_english.json,
>> > velocity, admin-extra.html, update-script.js,
>> > _schema_analysis_stopwords_english.json, solrconfig.xml,
>> > admin-extra.menu-top.html, elevate.xml, clustering, xslt,
>> > _rest_managed.json, mapping-ISOLatin1Accent.txt, spellings.txt, lang,
>> > admin-extra.menu-bottom.html]
>> >
>> > I've rebooted all my zookeeper nodes and restarted them - just in case...
>> > Same deal.
>> >
>> > Has anyone seen anything like this?
>> >
>>


Re: NoNode error on -downconfig when node does exist?

2016-08-08 Thread John Bickerstaff
OMG!

Thanks.  Too long staring at the same string.

On Mon, Aug 8, 2016 at 3:49 PM, Kevin Risden 
wrote:

> Just a quick guess: do you have a period (.) in your zk connection string
> chroot when you meant an underscore (_)?
>
> When you do the ls you use /solr6_1/configs, but you have /solr6.1 in your
> zk connection string chroot.
>
> Kevin Risden
>
> On Mon, Aug 8, 2016 at 4:44 PM, John Bickerstaff  >
> wrote:
>
> > First, the caveat:  I understand this is technically a zookeeper error.
> It
> > is an error that occurs when trying to deal with Solr however, so I'm
> > hoping someone on the list may have some insight.  Also, I'm getting the
> > error via the zkcli.sh tool that comes with Solr...
> >
> > I have created a collection in SolrCloud (6.1) giving the "techproducts"
> > sample directory as the location of the conf files.
> >
> > I then wanted to download those files from zookeeper to the local machine
> > via the -cmd downconfig command, so I issue this command:
> >
> > sudo /opt/solr/server/scripts/cloud-scripts/zkcli.sh -cmd downconfig
> > -confdir /home/john/conf/ -confname statdx -z 192.168.56.5/solr6.1
> >
> > Instead of the files, I get a stacktrace / error back which says :
> >
> > exception in thread "main" java.io.IOException: Error downloading files
> > from zookeeper path /configs/statdx to /home/john/conf
> > at
> > org.apache.solr.common.cloud.ZkConfigManager.downloadFromZK(
> > ZkConfigManager.java:117)
> > at
> > org.apache.solr.common.cloud.ZkConfigManager.downloadConfigDir(
> > ZkConfigManager.java:153)
> > at org.apache.solr.cloud.ZkCLI.main(ZkCLI.java:237)
> > *Caused by: org.apache.zookeeper.KeeperException$NoNodeException:
> > KeeperErrorCode = NoNode for /configs/statdx*
> > at org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
> > at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
> > at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1472)
> > at
> > org.apache.solr.common.cloud.SolrZkClient$6.execute(
> SolrZkClient.java:331)
> > at
> > org.apache.solr.common.cloud.SolrZkClient$6.execute(
> SolrZkClient.java:328)
> > at
> > org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(
> > ZkCmdExecutor.java:60)
> > at
> > org.apache.solr.common.cloud.SolrZkClient.getChildren(
> > SolrZkClient.java:328)
> > at
> > org.apache.solr.common.cloud.ZkConfigManager.downloadFromZK(
> > ZkConfigManager.java:101)
> > ... 2 more
> >
> > However, when I actually look in Zookeeper, I find that the "directory"
> > does exist and that inside it are listed all the files.
> >
> > Here is the output from zookeeper:
> >
> > [zk: localhost:2181(CONNECTED) 0] *ls /solr6_1/configs*
> > [statdx]
> >
> > and...
> >
> > [zk: localhost:2181(CONNECTED) 1] *ls /solr6_1/configs/statdx*
> > [mapping-FoldToASCII.txt, currency.xml, managed-schema, protwords.txt,
> > synonyms.txt, stopwords.txt, _schema_analysis_synonyms_english.json,
> > velocity, admin-extra.html, update-script.js,
> > _schema_analysis_stopwords_english.json, solrconfig.xml,
> > admin-extra.menu-top.html, elevate.xml, clustering, xslt,
> > _rest_managed.json, mapping-ISOLatin1Accent.txt, spellings.txt, lang,
> > admin-extra.menu-bottom.html]
> >
> > I've rebooted all my zookeeper nodes and restarted them - just in case...
> > Same deal.
> >
> > Has anyone seen anything like this?
> >
>


Re: NoNode error on -downconfig when node does exist?

2016-08-08 Thread Kevin Risden
Just a quick guess: do you have a period (.) in your zk connection string
chroot when you meant an underscore (_)?

When you do the ls you use /solr6_1/configs, but you have /solr6.1 in your
zk connection string chroot.

Kevin Risden

On Mon, Aug 8, 2016 at 4:44 PM, John Bickerstaff 
wrote:

> First, the caveat:  I understand this is technically a zookeeper error.  It
> is an error that occurs when trying to deal with Solr however, so I'm
> hoping someone on the list may have some insight.  Also, I'm getting the
> error via the zkcli.sh tool that comes with Solr...
>
> I have created a collection in SolrCloud (6.1) giving the "techproducts"
> sample directory as the location of the conf files.
>
> I then wanted to download those files from zookeeper to the local machine
> via the -cmd downconfig command, so I issue this command:
>
> sudo /opt/solr/server/scripts/cloud-scripts/zkcli.sh -cmd downconfig
> -confdir /home/john/conf/ -confname statdx -z 192.168.56.5/solr6.1
>
> Instead of the files, I get a stacktrace / error back which says :
>
> exception in thread "main" java.io.IOException: Error downloading files
> from zookeeper path /configs/statdx to /home/john/conf
> at
> org.apache.solr.common.cloud.ZkConfigManager.downloadFromZK(
> ZkConfigManager.java:117)
> at
> org.apache.solr.common.cloud.ZkConfigManager.downloadConfigDir(
> ZkConfigManager.java:153)
> at org.apache.solr.cloud.ZkCLI.main(ZkCLI.java:237)
> *Caused by: org.apache.zookeeper.KeeperException$NoNodeException:
> KeeperErrorCode = NoNode for /configs/statdx*
> at org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
> at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
> at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1472)
> at
> org.apache.solr.common.cloud.SolrZkClient$6.execute(SolrZkClient.java:331)
> at
> org.apache.solr.common.cloud.SolrZkClient$6.execute(SolrZkClient.java:328)
> at
> org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(
> ZkCmdExecutor.java:60)
> at
> org.apache.solr.common.cloud.SolrZkClient.getChildren(
> SolrZkClient.java:328)
> at
> org.apache.solr.common.cloud.ZkConfigManager.downloadFromZK(
> ZkConfigManager.java:101)
> ... 2 more
>
> However, when I actually look in Zookeeper, I find that the "directory"
> does exist and that inside it are listed all the files.
>
> Here is the output from zookeeper:
>
> [zk: localhost:2181(CONNECTED) 0] *ls /solr6_1/configs*
> [statdx]
>
> and...
>
> [zk: localhost:2181(CONNECTED) 1] *ls /solr6_1/configs/statdx*
> [mapping-FoldToASCII.txt, currency.xml, managed-schema, protwords.txt,
> synonyms.txt, stopwords.txt, _schema_analysis_synonyms_english.json,
> velocity, admin-extra.html, update-script.js,
> _schema_analysis_stopwords_english.json, solrconfig.xml,
> admin-extra.menu-top.html, elevate.xml, clustering, xslt,
> _rest_managed.json, mapping-ISOLatin1Accent.txt, spellings.txt, lang,
> admin-extra.menu-bottom.html]
>
> I've rebooted all my zookeeper nodes and restarted them - just in case...
> Same deal.
>
> Has anyone seen anything like this?
>


NoNode error on -downconfig when node does exist?

2016-08-08 Thread John Bickerstaff
First, the caveat:  I understand this is technically a zookeeper error.  It
is an error that occurs when trying to deal with Solr however, so I'm
hoping someone on the list may have some insight.  Also, I'm getting the
error via the zkcli.sh tool that comes with Solr...

I have created a collection in SolrCloud (6.1) giving the "techproducts"
sample directory as the location of the conf files.

I then wanted to download those files from zookeeper to the local machine
via the -cmd downconfig command, so I issue this command:

sudo /opt/solr/server/scripts/cloud-scripts/zkcli.sh -cmd downconfig
-confdir /home/john/conf/ -confname statdx -z 192.168.56.5/solr6.1

Instead of the files, I get a stacktrace / error back which says :

exception in thread "main" java.io.IOException: Error downloading files
from zookeeper path /configs/statdx to /home/john/conf
at
org.apache.solr.common.cloud.ZkConfigManager.downloadFromZK(ZkConfigManager.java:117)
at
org.apache.solr.common.cloud.ZkConfigManager.downloadConfigDir(ZkConfigManager.java:153)
at org.apache.solr.cloud.ZkCLI.main(ZkCLI.java:237)
*Caused by: org.apache.zookeeper.KeeperException$NoNodeException:
KeeperErrorCode = NoNode for /configs/statdx*
at org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1472)
at
org.apache.solr.common.cloud.SolrZkClient$6.execute(SolrZkClient.java:331)
at
org.apache.solr.common.cloud.SolrZkClient$6.execute(SolrZkClient.java:328)
at
org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.java:60)
at
org.apache.solr.common.cloud.SolrZkClient.getChildren(SolrZkClient.java:328)
at
org.apache.solr.common.cloud.ZkConfigManager.downloadFromZK(ZkConfigManager.java:101)
... 2 more

However, when I actually look in Zookeeper, I find that the "directory"
does exist and that inside it are listed all the files.

Here is the output from zookeeper:

[zk: localhost:2181(CONNECTED) 0] *ls /solr6_1/configs*
[statdx]

and...

[zk: localhost:2181(CONNECTED) 1] *ls /solr6_1/configs/statdx*
[mapping-FoldToASCII.txt, currency.xml, managed-schema, protwords.txt,
synonyms.txt, stopwords.txt, _schema_analysis_synonyms_english.json,
velocity, admin-extra.html, update-script.js,
_schema_analysis_stopwords_english.json, solrconfig.xml,
admin-extra.menu-top.html, elevate.xml, clustering, xslt,
_rest_managed.json, mapping-ISOLatin1Accent.txt, spellings.txt, lang,
admin-extra.menu-bottom.html]

I've rebooted all my zookeeper nodes and restarted them - just in case...
Same deal.

Has anyone seen anything like this?


Re: Can a MergeStrategy filter returned docs?

2016-08-08 Thread tedsolr
Some more info that might be helpful. If I can trust my logging this is
what's happening (search with rows=3 on collection with 2 shards):

1) delegating collector finish() method places custom data on request object
for _shard 1_
2) doc transformer transform() method is called for 3 requested docs
3) delegating collector finish() method places custom data on request object
for _shard 2_
4) doc transformer transform() method is called for 3 requested docs
5) merge strategy merge() method is called: documents for both shards are
there
6) doc transformer transform() method is called again (?) - twice for same
docid - possibly from different shards
7) boom - EOFException thrown

Caused by: java.io.EOFException
at
org.apache.solr.common.util.FastInputStream.readByte(FastInputStream.java:208)
at 
org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:188)
at
org.apache.solr.common.util.JavaBinCodec.readArray(JavaBinCodec.java:508)
at 
org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:202)
at
org.apache.solr.common.util.JavaBinCodec.readSolrDocumentList(JavaBinCodec.java:390)
at 
org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:237)
at
org.apache.solr.common.util.JavaBinCodec.readOrderedMap(JavaBinCodec.java:135)
at 
org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:204)
at
org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:126)
at
org.apache.solr.client.solrj.impl.BinaryResponseParser.processResponse(BinaryResponseParser.java:50)
... 15 more





RE: Solr Cloud with 5 servers cluster failed due to Leader out of memory

2016-08-08 Thread Ritesh Kumar (Avanade)
This is great, but where can I make this change in Solr 6, as I have implemented
CDCR?

Ritesh K
Infrastructure Sr. Engineer – Jericho Team
Sales & Marketing Digital Services
t +91-7799936921   v-kur...@microsoft.com

-Original Message-
From: Erick Erickson [mailto:erickerick...@gmail.com] 
Sent: 08 August 2016 21:30
To: solr-user 
Subject: Re: Solr Cloud with 5 servers cluster failed due to Leader out of 
memory

Yeah, Shawn, but you, like, know something about Tomcat and actually provide 
useful advice ;)

On Mon, Aug 8, 2016 at 6:44 AM, Shawn Heisey  wrote:
> On 8/7/2016 6:53 PM, Tim Chen wrote:
>> Exception in thread "http-bio-8983-exec-6571" java.lang.OutOfMemoryError: 
>> unable to create new native thread
>> at java.lang.Thread.start0(Native Method)
>> at java.lang.Thread.start(Thread.java:714)
>> at 
>> java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:949)
>> at 
>> java.util.concurrent.ThreadPoolExecutor.processWorkerExit(ThreadPoolExecutor.java:1017)
>> at 
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1163)
>> at 
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>> at 
>> org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
>> at java.lang.Thread.run(Thread.java:745)
>
> I find myself chasing Erick once again. :)  Supplementing what he told you:
>
> There are two things that might be happening here.
>
> 1) The Tomcat setting "maxThreads" may be limiting the number of threads.
> This defaults to 200, and should be increased to 10000.  The specific
> error doesn't sound like an application limit, though -- it acts more 
> like Java itself can't create the thread.  If you have already 
> adjusted maxThreads, then it's more likely to be the second option:
>
> 2) The operating system may be imposing a limit on the number of 
> processes/threads a user is allowed to start.  On Linux systems, this 
> is typically 1024.  For other operating systems, I am not sure what 
> the default limit is.
>
> Thanks,
> Shawn
>


Re: Can a MergeStrategy filter returned docs?

2016-08-08 Thread tedsolr
That makes sense. I would prefer to just merge the custom analytics, but
sending that much info via the solr response seems very slow. However I
still can't figure out how to access the custom analytics in a doc
transformer. That would provide the fastest response but I would have to
merge the Ids myself. I think I have only two paths, one appears to be too
slow, the other just throws exceptions.

The slow approach:
- The delegating collector computes the analytics for each collected doc: {
docId, { ... }}
- From the finish() method it places that map (size could be million+
elements) on the solr response: (response builder).rsp.add("customStats",
obj)
- The merge strategy gets the analytics from each shard response, merges
them only for the docs returned to the caller, then adds them to the solr
query response (size is now thousands, not millions).

This would work, but it's really slow. Does that have to do with putting the
analytics on the solr response for the merge object to pick up?

The broken approach (only works for single shard):
- The delegating collector computes the analytics for each collected doc
(exactly the same as above)
- From the finish() method it places that map (size could be million+
elements) on the solr query request: (response
builder).req.getContext().put("customStats", obj)
- Doc transformer reads the analytics and adds a field to the doc containing
the stats for that one field (the analytics are injected into the returned
doc)
- The merge strategy combines the analytics of duplicate docs. 

When the doc transformer first tries to read the analytics for the second
shard it throws exceptions. So either this approach is not possible, or my
implementation is flawed. You may not be able to determine anything from a
small code snippet, but this is my transform method:

public void transform(SolrDocument doc, int id) throws IOException {
    if (super.context != null) {
        HashMap stats = (HashMap) super.context.req.getContext().get("CustomAnalytics");

        HashMap fieldStats = stats.get(id);
        if (fieldStats != null) {
            doc.setField(field, fieldStats.print());
        }
    }
}

Any idea why the latter approach is not working?

Joel Bernstein wrote
> The mergeIds() method should be true if you are handling the merge of the
> documents from the shards. If you are merging custom analytics from an
> AnalyticsQuery only then you would return false. In your case, since you
> are de-duping documents you would need to return true.







Re: Should we still optimize?

2016-08-08 Thread Yonik Seeley
On Mon, Aug 8, 2016 at 5:10 AM, Callum Lamb  wrote:
> We have a cronjob that runs every week at a quiet time to run the
> optimize command on our Solr collections. Even when it's quiet it's still an
> extremely heavy operation.
>
> One of the things I keep seeing on stackoverflow is that optimizing is now
> essentially deprecated and lucene (We're on Solr 5.5.2) will now keep the
> amount of segments at a reasonable level and that the performance impact of
> having deletedDocs is now much less.

Optimize is certainly not deprecated.
The operation was renamed to forceMerge at the Lucene level (but not
the Solr level) due to concerns that people may think it was necessary
for good performance and didn't realize the cost.

> One of our cores doesn't get optimized and it's currently sitting at 5.5
> million documents with 1.9 million deleted docs. Which seems pretty high to
> me.
>
> How true is this claim? Is optimizing still a good idea for the general
> case?

The cost of optimize will always be high (but the impact of that cost
depends on the user/use case).  The benefit may be small to large.
I don't think one can really give a recommendation for the general case.

-Yonik


Re: Should we still optimize?

2016-08-08 Thread Walter Underwood
Did you change the merge settings and max segments? If you did, try going back 
to the defaults.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)


> On Aug 8, 2016, at 8:56 AM, Erick Erickson  wrote:
> 
> Callum:
> 
> re: the optimize failing: Perhaps it's just timing out?
> That is, the command succeeds fine (which you
> are reporting), but it's taking long enough that the
> request times out so the client you're using reports an error.
> Just a guess...
> 
> My personal feeling is that (of course), you need to measure
> your perf before/after optimize to see if there's a measurable
> difference. Apart from that, Shawn's comments about the
> stats being different due to deleted docs is germane.
> 
> Have you tried adding expungeDeletes=true to a commit
> message? See:
> https://cwiki.apache.org/confluence/display/solr/Uploading+Data+with+Index+Handlers
> 
> A little-known option is to control how aggressively
> the % of deleted documents is factored in to the decision
> whether to merge segments or not. It takes a little
> code-diving, and faith, but if you look at TieredMergePolicy,
> you'll see a double field: reclaimDeletesWeight.
> 
> Now, in your solrconfig.xml file you can set this, there's a
> clever bit of reflection to allow these to be specified, going
> from memory it's just
> <double name="reclaimDeletesWeight">3.0</double>
> as a node in your tiered merge config. The default is 2.0.
> In terms of what that _does_, that's where code-diving
> comes in.
> 
> Best,
> Erick
> 
> On Mon, Aug 8, 2016 at 7:59 AM, Callum Lamb  wrote:
>> Yeah I figured that was too many deleteddocs. It could just be that our max
>> segments is set too high though.
>> 
>> The reason I asked is because our optimize requests have started failing.
>> Or at least,they are appearing to fail because the optimize request returns
>> a non 200. The optimize seems to go ahead successfully regardless though.
>> Before trying to find out if I can  asynchronously request and poll for
>> success (doesn't appear to be possible yet) or a better way of determining
>> success, I thought I'd check if the whole thing was necessary to begin with.
>> 
>> Hopefully it doesn't involve polling the core status until deleteddocs goes
>> below a certain level :/.
>> 
>> Cheers for info.
>> 
>> On Mon, Aug 8, 2016 at 2:58 PM, Shawn Heisey  wrote:
>> 
>>> On 8/8/2016 3:10 AM, Callum Lamb wrote:
 How true is this claim? Is optimizing still a good idea for the
 general case?
>>> 
>>> For the general case, optimizing is not recommended.  If there are a
>>> very large number of deleted documents, which does describe your
>>> situation, then there is definitely a benefit.
>>> 
>>> In cases where there are a lot of deleted documents, scoring can be
>>> affected by the presence of the deleted documents, and the drop in index
>>> size after an optimize can result in a large performance boost.  For the
>>> general case where there are not many deletes, there *is* a performance
>>> benefit to optimizing down to a single segment, but it is nowhere near
>>> as dramatic as it was in the 1.x/3.x days.
>>> 
>>> The problem with optimizes in the general case is this:  The performance
>>> hit that the optimize operation itself causes may not be worth the small
>>> performance improvement.
>>> 
>>> If you have a time where your index is quiet enough that the optimize
>>> itself won't be disruptive, then you should certainly take advantage of
>>> that time and do the optimize, even if there aren't many deletes.
>>> 
>>> There is another benefit to optimizes that doesn't get mentioned often:
>>> It can make subsequent normal merging operations during indexing faster,
>>> because there will not be as many large segments.
>>> 
>>> Thanks,
>>> Shawn
>>> 
>>> 
>> 



Re: Solr Cloud with 5 servers cluster failed due to Leader out of memory

2016-08-08 Thread Erick Erickson
Yeah, Shawn, but you, like, know something about Tomcat and
actually provide useful advice ;)

On Mon, Aug 8, 2016 at 6:44 AM, Shawn Heisey  wrote:
> On 8/7/2016 6:53 PM, Tim Chen wrote:
>> Exception in thread "http-bio-8983-exec-6571" java.lang.OutOfMemoryError: 
>> unable to create new native thread
>> at java.lang.Thread.start0(Native Method)
>> at java.lang.Thread.start(Thread.java:714)
>> at 
>> java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:949)
>> at 
>> java.util.concurrent.ThreadPoolExecutor.processWorkerExit(ThreadPoolExecutor.java:1017)
>> at 
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1163)
>> at 
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>> at 
>> org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
>> at java.lang.Thread.run(Thread.java:745)
>
> I find myself chasing Erick once again. :)  Supplementing what he told you:
>
> There are two things that might be happening here.
>
> 1) The Tomcat setting "maxThreads" may be limiting the number of threads.
> This defaults to 200, and should be increased to 10000.  The specific
> error doesn't sound like an application limit, though -- it acts more
> like Java itself can't create the thread.  If you have already adjusted
> maxThreads, then it's more likely to be the second option:
>
> 2) The operating system may be imposing a limit on the number of
> processes/threads a user is allowed to start.  On Linux systems, this is
> typically 1024.  For other operating systems, I am not sure what the
> default limit is.
>
> Thanks,
> Shawn
>


Re: query problem

2016-08-08 Thread Erick Erickson
If at all possible, denormalize the data.

But you can also use Solr's Join capability here, see:
https://cwiki.apache.org/confluence/display/solr/Other+Parsers#OtherParsers-JoinQueryParser
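To make that concrete, a hedged SolrJ sketch of the join-parser approach for the example in the question; the core URL is hypothetical and the field names follow the original post:

// Hedged sketch: select ticket documents whose customerid_i matches a customer
// document with name_s:FISHER. Note the join query parser joins within a single
// core/shard.
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;

public class JoinQuerySketch {
    public static void main(String[] args) throws Exception {
        HttpSolrClient client =
                new HttpSolrClient.Builder("http://localhost:8983/solr/collection1").build();

        SolrQuery query = new SolrQuery("type_s:ticket");
        // Join from the matching customer docs back onto tickets via customerid_i
        query.addFilterQuery(
            "{!join from=customerid_i to=customerid_i}type_s:customer AND name_s:FISHER");

        QueryResponse rsp = client.query(query);
        System.out.println("Matching tickets: " + rsp.getResults().getNumFound());

        client.close();
    }
}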

Best,
Erick

On Mon, Aug 8, 2016 at 8:47 AM, Pithon Philippe  wrote:
> Hello,
> I have two documents type :
> - tickets (type_s:"ticket", customerid_i:10)
> - customers (type_s:customer,customerid_i:10,name_s:"FISHER" )
>
> I want a query to find all tickets for name customer FISHER
> In document ticket (type_s:"ticket") , I have id customer but not name
> customer...
>
> Any ideas ???
>
> Thanks


Re: Should we still optimize?

2016-08-08 Thread Erick Erickson
Callum:

re: the optimize failing: Perhaps it's just timing out?
That is, the command succeeds fine (which you
are reporting), but it's taking long enough that the
request times out so the client you're using reports an error.
Just a guess...

My personal feeling is that (of course), you need to measure
your perf before/after optimize to see if there's a measurable
difference. Apart from that, Shawn's comments about the
stats being different due to deleted docs is germane.

Have you tried adding expungeDeletes=true to a commit
message? See:
https://cwiki.apache.org/confluence/display/solr/Uploading+Data+with+Index+Handlers
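As an illustration, a minimal SolrJ sketch of sending such a commit (the core URL is hypothetical); the same thing can be done over HTTP with commit=true&expungeDeletes=true on the update handler:

// Hedged sketch: a commit request that also sets expungeDeletes=true, asking Lucene
// to merge away segments carrying deleted documents.
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.request.AbstractUpdateRequest;
import org.apache.solr.client.solrj.request.UpdateRequest;

public class ExpungeDeletesSketch {
    public static void main(String[] args) throws Exception {
        HttpSolrClient client =
                new HttpSolrClient.Builder("http://localhost:8983/solr/collection1").build();

        UpdateRequest req = new UpdateRequest();
        req.setAction(AbstractUpdateRequest.ACTION.COMMIT, true, true); // waitFlush, waitSearcher
        req.setParam("expungeDeletes", "true");
        req.process(client);

        client.close();
    }
}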

A little-known option is to control how aggressively
the % of deleted documents is factored in to the decision
whether to merge segments or not. It takes a little
code-diving, and faith, but if you look at TieredMergePolicy,
you'll see a double field: reclaimDeletesWeight.

Now, in your solrconfig.xml file you can set this, there's a
clever bit of reflection to allow these to be specified, going
from memory it's just
<double name="reclaimDeletesWeight">3.0</double>
as a node in your tiered merge config. The default is 2.0.
In terms of what that _does_, that's where code-diving
comes in.

Best,
Erick

On Mon, Aug 8, 2016 at 7:59 AM, Callum Lamb  wrote:
> Yeah I figured that was too many deleteddocs. It could just be that our max
> segments is set too high though.
>
> The reason I asked is because our optimize requests have started failing.
> Or at least,they are appearing to fail because the optimize request returns
> a non 200. The optimize seems to go ahead successfully regardless though.
> Before trying to find out if I can  asynchronously request and poll for
> success (doesn't appear to be possible yet) or a better way of determining
> success, I thought I'd check if the whole thing was necessary to begin with.
>
> Hopefully it doesn't involve polling the core status until deleteddocs goes
> below a certain level :/.
>
> Cheers for info.
>
> On Mon, Aug 8, 2016 at 2:58 PM, Shawn Heisey  wrote:
>
>> On 8/8/2016 3:10 AM, Callum Lamb wrote:
>> > How true is this claim? Is optimizing still a good idea for the
>> > general case?
>>
>> For the general case, optimizing is not recommended.  If there are a
>> very large number of deleted documents, which does describe your
>> situation, then there is definitely a benefit.
>>
>> In cases where there are a lot of deleted documents, scoring can be
>> affected by the presence of the deleted documents, and the drop in index
>> size after an optimize can result in a large performance boost.  For the
>> general case where there are not many deletes, there *is* a performance
>> benefit to optimizing down to a single segment, but it is nowhere near
>> as dramatic as it was in the 1.x/3.x days.
>>
>> The problem with optimizes in the general case is this:  The performance
>> hit that the optimize operation itself causes may not be worth the small
>> performance improvement.
>>
>> If you have a time where your index is quiet enough that the optimize
>> itself won't be disruptive, then you should certainly take advantage of
>> that time and do the optimize, even if there aren't many deletes.
>>
>> There is another benefit to optimizes that doesn't get mentioned often:
>> It can make subsequent normal merging operations during indexing faster,
>> because there will not be as many large segments.
>>
>> Thanks,
>> Shawn
>>
>>
>


query problem

2016-08-08 Thread Pithon Philippe
Hello,
I have two document types:
- tickets (type_s:"ticket", customerid_i:10)
- customers (type_s:"customer", customerid_i:10, name_s:"FISHER")

I want a query to find all tickets for the customer named FISHER.
In the ticket documents (type_s:"ticket") I have the customer id, but not the
customer name...

Any ideas ???

Thanks


Re: Should we still optimize?

2016-08-08 Thread Callum Lamb
Yeah, I figured that was too many deleted docs. It could just be that our max
segments is set too high, though.

The reason I asked is because our optimize requests have started failing.
Or at least, they appear to fail because the optimize request returns
a non-200. The optimize seems to go ahead successfully regardless, though.
Before trying to find out if I can asynchronously request and poll for
success (doesn't appear to be possible yet) or a better way of determining
success, I thought I'd check if the whole thing was necessary to begin with.

Hopefully it doesn't involve polling the core status until deleteddocs goes
below a certain level :/.

Cheers for info.

On Mon, Aug 8, 2016 at 2:58 PM, Shawn Heisey  wrote:

> On 8/8/2016 3:10 AM, Callum Lamb wrote:
> > How true is this claim? Is optimizing still a good idea for the
> > general case?
>
> For the general case, optimizing is not recommended.  If there are a
> very large number of deleted documents, which does describe your
> situation, then there is definitely a benefit.
>
> In cases where there are a lot of deleted documents, scoring can be
> affected by the presence of the deleted documents, and the drop in index
> size after an optimize can result in a large performance boost.  For the
> general case where there are not many deletes, there *is* a performance
> benefit to optimizing down to a single segment, but it is nowhere near
> as dramatic as it was in the 1.x/3.x days.
>
> The problem with optimizes in the general case is this:  The performance
> hit that the optimize operation itself causes may not be worth the small
> performance improvement.
>
> If you have a time where your index is quiet enough that the optimize
> itself won't be disruptive, then you should certainly take advantage of
> that time and do the optimize, even if there aren't many deletes.
>
> There is another benefit to optimizes that doesn't get mentioned often:
> It can make subsequent normal merging operations during indexing faster,
> because there will not be as many large segments.
>
> Thanks,
> Shawn
>
>




Re: problems with bulk indexing with concurrent DIH

2016-08-08 Thread Shawn Heisey
On 8/2/2016 7:50 AM, Bernd Fehling wrote:
> Only assumption so far, DIH is sending the records as "update" (and
> not pure "add") to the indexer which will generate delete files during
> merge. If the number of segments is high it will take quite long to
> merge and check all records of all segments.

It's not DIH that's handling the requests as "update", it's Solr.  If
you index a document with the same value in the uniqueKey field as a
document that already exists in the index, Solr will delete the old one
before it adds the new one.  This applies to ANY indexing, not just
DIH.  This is how Solr is designed to work -- that's the entire point of
having a uniqueKey.

I'm not familiar with how a large number of deletes affects merging.  I
would not expect it to have much of a performance impact, and it might
in fact make merging faster, because I'd think that deleted docs would
be skipped.

Turning overwrite off when you are indexing would mean that Solr's
uniqueKey guarantee is lost.  You can end up with duplicate documents in
the Lucene index, and because merging can completely change internal
identifiers, there may be no built-in way for Solr or Lucene to
automatically determine which ones are old or new.
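A hedged SolrJ sketch of what that trade-off looks like in practice (the core URL, ids, and field names are hypothetical):

// Hedged sketch: the first two adds use the same uniqueKey, so the second silently
// replaces the first; adding with overwrite=false skips that delete-then-add step
// but can leave duplicate documents behind.
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.request.UpdateRequest;
import org.apache.solr.common.SolrInputDocument;

public class OverwriteSketch {
    private static SolrInputDocument doc(String id, String title) {
        SolrInputDocument d = new SolrInputDocument();
        d.addField("id", id);
        d.addField("title_s", title);
        return d;
    }

    public static void main(String[] args) throws Exception {
        HttpSolrClient client =
                new HttpSolrClient.Builder("http://localhost:8983/solr/collection1").build();

        // Default behaviour: same uniqueKey, so the old document is replaced.
        client.add(doc("42", "first version"));
        client.add(doc("42", "second version"));   // replaces the first

        // overwrite=false: the uniqueKey check is skipped, so duplicates are possible.
        UpdateRequest req = new UpdateRequest();
        req.add(doc("42", "third version"), /* overwrite = */ false);
        req.process(client);

        client.commit();
        client.close();
    }
}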

I didn't know about LUCENE-6161.  That looks like a nasty bug.

Thanks,
Shawn



Re: Should we still optimize?

2016-08-08 Thread Shawn Heisey
On 8/8/2016 3:10 AM, Callum Lamb wrote:
> How true is this claim? Is optimizing still a good idea for the
> general case?

For the general case, optimizing is not recommended.  If there are a
very large number of deleted documents, which does describe your
situation, then there is definitely a benefit.

In cases where there are a lot of deleted documents, scoring can be
affected by the presence of the deleted documents, and the drop in index
size after an optimize can result in a large performance boost.  For the
general case where there are not many deletes, there *is* a performance
benefit to optimizing down to a single segment, but it is nowhere near
as dramatic as it was in the 1.x/3.x days.

The problem with optimizes in the general case is this:  The performance
hit that the optimize operation itself causes may not be worth the small
performance improvement.

If you have a time where your index is quiet enough that the optimize
itself won't be disruptive, then you should certainly take advantage of
that time and do the optimize, even if there aren't many deletes.
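For reference, a minimal SolrJ sketch of kicking off that optimize during a quiet window (the core URL is hypothetical):

// Hedged sketch: trigger an optimize (forceMerge) down to a single segment.
import org.apache.solr.client.solrj.impl.HttpSolrClient;

public class OptimizeSketch {
    public static void main(String[] args) throws Exception {
        HttpSolrClient client =
                new HttpSolrClient.Builder("http://localhost:8983/solr/collection1").build();

        // waitFlush=true, waitSearcher=true, merge down to 1 segment
        client.optimize(true, true, 1);

        client.close();
    }
}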

There is another benefit to optimizes that doesn't get mentioned often: 
It can make subsequent normal merging operations during indexing faster,
because there will not be as many large segments.

Thanks,
Shawn



Re: Can a MergeStrategy filter returned docs?

2016-08-08 Thread Joel Bernstein
The mergeIds() method should return true if you are handling the merge of the
documents from the shards. If you are merging custom analytics from an
AnalyticsQuery only, then you would return false. In your case, since you
are de-duping documents, you would need to return true.

There are two methods in the MergeStrategy that you need to implement if
you need to merge docs based on fields other than the sort fields.

This is tricky to implement, though, and requires a significant understanding
of Solr's internals. The method QueryComponent.doFieldSortValues shows how
Solr sends the sort fields to the aggregator node. You can send your own
merge fields to the aggregator by implementing methods in the MergeStrategy.

The methods you need to implement in the MergeStrategy if you need custom
fields to do the merge are below:

/**
 * handlesMergeFields must return true if the MergeStrategy
 * implements a custom handleMergeFields(ResponseBuilder rb, SolrIndexSearcher searcher).
 */
public boolean handlesMergeFields();

/**
 * Implement handleMergeFields(ResponseBuilder rb, SolrIndexSearcher searcher) if
 * your merge strategy needs more complex data than the sort fields provide.
 */
public void handleMergeFields(ResponseBuilder rb, SolrIndexSearcher searcher) throws IOException;


Joel Bernstein
http://joelsolr.blogspot.com/

On Fri, Aug 5, 2016 at 10:56 AM, tedsolr  wrote:

> I don't see any field level data exposed in the SolrDocumentList I get from
> shardResponse.getSolrResponse().getResponse().get("response"). I see the
> unique ID field and value. Is that by design or am I being stupid?
>
> Separate but related question: the mergIds() method in the merge strategy
> class - when TRUE the developer is taking responsibility for the document
> merge, when FALSE it looks like the QueryComponent puts all the results in
> a
> sorted queue and removes the "extras" - right? When rows=3 each shard
> returns 3 docs, but the user only wants 3 total not 3 per shard. So, if I
> set mergeIds=FALSE I won't have to resort the docs, just eliminate the
> dupes
> somehow.
>
>
> Joel Bernstein wrote
> > Collapse will have dups unless you use the _route_ parameter to co-locate
> > documents with the same group, onto the same shard.
> >
> > In you're scenario, co-locating docs sounds like it won't work because
> you
> > may have different grouping criteria.
> >
> > The doc counts would be inflated unless you sent all the documents from
> > the
> > shards to be merged and then de-duped them, which is how streaming
> > operates. But streaming has the capability to do these types of
> operations
> > in parallel and the merge strategy does not.
>
>
>
>
>
>


Re: Solr Cloud with 5 servers cluster failed due to Leader out of memory

2016-08-08 Thread Shawn Heisey
On 8/7/2016 6:53 PM, Tim Chen wrote:
> Exception in thread "http-bio-8983-exec-6571" java.lang.OutOfMemoryError: 
> unable to create new native thread
> at java.lang.Thread.start0(Native Method)
> at java.lang.Thread.start(Thread.java:714)
> at 
> java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:949)
> at 
> java.util.concurrent.ThreadPoolExecutor.processWorkerExit(ThreadPoolExecutor.java:1017)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1163)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at 
> org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
> at java.lang.Thread.run(Thread.java:745)

I find myself chasing Erick once again. :)  Supplementing what he told you:

There are two things that might be happening here.

1) The Tomcat setting "maxThreads" may be limiting the number of threads.
This defaults to 200, and should be increased to 10000.  The specific
error doesn't sound like an application limit, though -- it acts more
like Java itself can't create the thread.  If you have already adjusted
maxThreads, then it's more likely to be the second option:

2) The operating system may be imposing a limit on the number of
processes/threads a user is allowed to start.  On Linux systems, this is
typically 1024.  For other operating systems, I am not sure what the
default limit is.

Thanks,
Shawn



Solr 5.4.1 Master/Slave Replication

2016-08-08 Thread Kalpana
Hello

I have 14 cores, with a couple of them using shards, and now I am looking at
a master/slave fallback solution. Can anyone please point me in the right
direction to get started?

Thanks
Kalpana





Re: Group and sum in SOLR 5.3

2016-08-08 Thread andreap21
Hi Pablo, will try this.

Sorry for the late reply but I didn't get any notification of this answer!

Thanks,
Andrea





Should we still optimize?

2016-08-08 Thread Callum Lamb
We have a cronjob that runs every week at a quiet time to run the optimize
command on our Solr collections. Even when it's quiet, it's still an
extremely heavy operation.

One of the things I keep seeing on Stack Overflow is that optimizing is now
essentially deprecated, that Lucene (we're on Solr 5.5.2) will now keep the
number of segments at a reasonable level, and that the performance impact of
having deletedDocs is now much less.

One of our cores doesn't get optimized and it's currently sitting at 5.5
million documents with 1.9 million deleted docs. Which seems pretty high to
me.

How true is this claim? Is optimizing still a good idea for the general
case?
