Re: Solr Hangs During Updates for over 10 minutes

2013-07-10 Thread Jed Glazner
We are planning an upgrade to 4.4, but it's still weeks out. We offer a
high-availability search service, and there are a number of changes in 4.4
that are not backward compatible (e.g. clusterstate.json and no solr.xml),
so there must be lots of testing; additionally, this upgrade cannot be
performed without downtime.

Regardless, I need to find a band-aid right now.  Does anyone know if it's
possible to set the timeout for distributed update requests to/from the
leader?  Currently we see it's set to 0.  Maybe via a -D startup param, or
something?

Jed

On 7/10/13 1:23 AM, Otis Gospodnetic otis.gospodne...@gmail.com wrote:

Hi Jed,

This is really with Solr 4.0?  If so, it may be wiser to jump on 4.4
that is about to be released.  We did not have fun working with 4.0 in
SolrCloud mode a few months ago.  You will save time, hair, and money
if you convince your manager to let you use Solr 4.4. :)

Otis
--
Solr & ElasticSearch Support -- http://sematext.com/
Performance Monitoring -- http://sematext.com/spm



On Tue, Jul 9, 2013 at 4:44 PM, Jed Glazner jglaz...@adobe.com wrote:
 Hi Shawn,

 I have been trying to duplicate this problem without success for the
last 2 weeks which is one reason I'm getting flustered.   It seems
reasonable to be able to duplicate it but I can't.

  We do have a story to upgrade but that is still weeks if not months
before that gets rolled out to production.

 We have another cluster running the same version, but with 8 shards and
8 replicas, each shard at 100GB, with more load and more indexing
requests, without this problem; but there we send docs in batches and all
fields are stored, whereas the trouble index has only 1 or 2 stored
fields and we send docs 1 at a time.

 Could that have anything to do with it?

 Jed


 Sent from Samsung Mobile



   Original message 
 From: Shawn Heisey s...@elyograg.org
 Date: 07.09.2013 18:33 (GMT+01:00)
 To: solr-user@lucene.apache.org
 Subject: Re: Solr Hangs During Updates for over 10 minutes


 On 7/9/2013 9:50 AM, Jed Glazner wrote:
 I'll give you the high level before delving deep into setup etc. I
have been struggling at work with a seemingly random problem where solr
will hang for 10-15 minutes during updates.  This outage always seems
to be immediately preceded by an EOF exception on the replica.  Then
10-15 minutes later we see an exception on the leader for a socket
timeout to the replica.  The leader will then tell the replica to
recover which in most cases it does and then the outage is over.

 Here are the setup details:

 We are currently using Solr 4.0.0 with an external ZK ensemble of 5
machines.

 After 4.0.0 was released, a *lot* of problems with SolrCloud surfaced
 and have since been fixed.  You're five releases and about nine months
 behind what's current.  My recommendation: Upgrade to 4.3.1, ensure your
 configuration is up to date with changes to the example config between
 4.0.0 and 4.3.1, and reindex.  Ideally, you should set up a 4.0.0
 testbed, duplicate your current problem, and upgrade the testbed to see
 if the problem goes away.  A testbed will also give you practice for a
 smooth upgrade of your production system.

 Thanks,
 Shawn




Re: Solr Hangs During Updates for over 10 minutes

2013-07-10 Thread Jed Glazner
Hey Daniel,

Thanks for the response.  I think we'll give this a try to see if this
helps.

Jed.

On 7/10/13 10:48 AM, Daniel Collins danwcoll...@gmail.com wrote:

We had something similar in terms of update times suddenly spiking up for
no obvious reason.  We never got quite as bad as you in terms of the other
knock-on effects, but we certainly saw updates jumping from 10ms up to
3ms, all our external queues backed up and we rejected some updates,
then after a while things quietened down.

We were running Solr 4.3.0 but with Java 6 and the CMS GC.  We swapped to
Java 7, G1 GC (and increased heap size from 8Gb to 12Gb) and the problem
went away.

Now, I admit it's not exactly the same as your case, as we never had the
follow-on effects, but I'd consider Java 7 and the G1 GC; it has certainly
reduced the spikes in our indexing times.

We run the following settings now (the usual caveats apply, it might not
work for you).

GC_OPTIONS="-XX:+AggressiveOpts -XX:+UseG1GC -XX:+UseStringCache
-XX:+OptimizeStringConcat -XX:-UseSplitVerifier -XX:+UseNUMA
-XX:MaxGCPauseMillis=50 -XX:GCPauseIntervalMillis=1000"
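For anyone wanting to try these flags, here is one hypothetical way they could be wired into a Jetty-based Solr start line; the heap sizes, zkHost value, and start.jar path are assumptions for the sketch, not taken from Daniel's actual setup:

```shell
# Illustrative only: pass the GC options alongside heap settings when
# launching Solr's embedded Jetty. All values here are placeholders.
GC_OPTIONS="-XX:+UseG1GC -XX:MaxGCPauseMillis=50 -XX:GCPauseIntervalMillis=1000"

java -Xms4g -Xmx12g $GC_OPTIONS \
     -DzkHost=zk1:2181,zk2:2181,zk3:2181 \
     -jar start.jar
```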

I set the MaxGCPauseMillis/GCPauseIntervalMillis to try to minimise
application pauses; that's our goal. If we have to use more memory in the
short term then so be it, but we couldn't afford application pauses,
because we are using NRT (soft commits every 1s, hard commits every 60s)
and we get a lot of updates.

I know there have been other discussions on G1, and it has received mixed
results overall, but for us it seems to be a winner.

Hope that helps,







Re: Solr Hangs During Updates for over 10 minutes

2013-07-10 Thread Jed Glazner
It is certainly 'more' possible, as we have additional code that revolves
around reading the clusterstate.json, and since Solr changed the format of
the clusterstate.json from 4.0 to 4.1, it requires additional code changes
to our service; the solrj lib from 4.0 isn't compatible with anything after
4.0 due to the clusterstate.json change.  I can, however, run Java 7 with
these GC settings in a dev env under load to see if they blow up or if it's
even possible, and then roll it out to the replica, and then to the leader.
I cannot, however, do this with a Solr upgrade without significant coding
changes to our service, which would require us to roll out new code for our
service as well as new Solr instances.

So, while it's 'just as risky' as you say, it's 'less risky' than a new
version of java and is possible to implement without downtime.

It is actually something of a pain point that the upgrade path to
solrcloud seems to frequently require downtime. (clusterstate.json changes
in 4.1, and then again this big change in 4.4 with no solr.xml).

So we'll do what we can quickly to see if we can 'band-aid' the problem
until we can upgrade to Solr 4.4.  Speaking of band-aids: does anyone know
of a way to change the socket timeout/connection timeout for distributed
updates?

Jed.

On 7/10/13 2:38 PM, Erick Erickson erickerick...@gmail.com wrote:

Jed:

I'm not sure changing Java runtime is any less scary than upgrading
Solr

Wait, I know! Ask your manager if you can do both at once (evil smirk). I
have a t-shirt that says "I don't test, but when I do it's in production..."

Erick

AW: Solr Hangs During Updates for over 10 minutes

2013-07-10 Thread Jed Glazner
Hi Shawn, this code is for the solrj lib, which we already use.

I'm talking about Solr's internal communication from leader to replica via the 
DistributedCmdUpdate class.  I want to force the leader to time out after a 
fixed period instead of waiting for 15 minutes for the server to figure out that 
the other end of the socket was closed. I don't know of any flags or settings in 
the solrconfig.xml to do this, or if it's even possible without modifying 
source code.

Jed

Sent from Samsung Mobile



 Original message 
From: Shawn Heisey s...@elyograg.org
Date: 07.10.2013 17:35 (GMT+01:00)
To: solr-user@lucene.apache.org
Subject: Re: Solr Hangs During Updates for over 10 minutes


On 7/10/2013 6:57 AM, Jed Glazner wrote:
 So we'll do what we can quickly to see if we can 'band-aid' the problem
 until we can upgrade to solr 4.4  Speaking of band-aids - does anyone know
 of a way to change the socket timeout/connection timeout for distributed
 updates?

If you need to change HttpClient parameters for CloudSolrServer, here's
how you can do it:

String zkHost =
"zk1.REDACTED.com:2181,zk2.REDACTED.com:2181,zk3.REDACTED.com:2181/chroot";
ModifiableSolrParams params = new ModifiableSolrParams();
params.set(HttpClientUtil.PROP_MAX_CONNECTIONS, 1000);
params.set(HttpClientUtil.PROP_MAX_CONNECTIONS_PER_HOST, 200);
params.set(HttpClientUtil.PROP_SO_TIMEOUT, 30);
params.set(HttpClientUtil.PROP_CONNECTION_TIMEOUT, 5000);
HttpClient client = HttpClientUtil.createClient(params);
ResponseParser parser = new BinaryResponseParser();
LBHttpSolrServer lbServer = new LBHttpSolrServer(client, parser);
CloudSolrServer server = new CloudSolrServer(zkHost, lbServer);

Thanks,
Shawn



Solr Hangs During Updates for over 10 minutes

2013-07-09 Thread Jed Glazner
I'll give you the high level before delving deep into setup etc. I have been 
struggling at work with a seemingly random problem where solr will hang for 
10-15 minutes during updates.  This outage always seems to be immediately 
preceded by an EOF exception on the replica.  Then 10-15 minutes later we see 
an exception on the leader for a socket timeout to the replica.  The leader 
will then tell the replica to recover, which in most cases it does, and then the 
outage is over.

Here are the setup details:

We are currently using Solr 4.0.0 with an external ZK ensemble of 5 machines. 
We have 2 active collections each with only 1 shard (we have in total about 15 
collections but most are empty or have less than 100 docs). The first index 
(collection1) is 6.5GB and has ~18M documents.  The 2nd index (collection2) is 
9GB and has about 13M documents. In all cases the leader resides on 1 server 
and the replica resides on the other.  Both servers are AWS XL High Mem 
instances. (8 CPUs @ 2.67Ghz, 70GB Ram) with the index residing on a 1TB raid 
10 using ephemeral storage disks.  We are starting solr using the embedded 
jetty with the following java memory and GC options:

-Xmx16382m -Xms4092m -XX:MaxPermSize=256m -Xss256k -XX:NewSize=1536m 
-XX:SurvivorRatio=16 -XX:+DisableExplicitGC -XX:+UseConcMarkSweepGC 
-XX:ParallelCMSThreads=2 -XX:+CMSClassUnloadingEnabled 
-XX:+UseCMSCompactAtFullCollection -XX:CMSInitiatingOccupancyFraction=80 
-XX:+CMSParallelRemarkEnabled

Both collections receive a constant stream of updates ~10k per hour (both 
adds/deletes).  Approximately once per day the following events transpire:


 1.  We see a log entry for a distributed update that takes just over 5 ms 
followed by an EOF write exception on the replica. In all cases this exception 
is triggered by an update to the 9GB collection.
 2.  Occasionally we'll see a 503 shard update error on the leader but usually 
not.
 3.  Approximately 15 minutes after this exception we see a timeout error for 
this distributed update request on the leader.
 4.  The leader then creates a new connection and tells the replica to recover, 
which it does and everything is OK again.
 5.  During the 15 minute window from when the replica throws the EOF until the 
SocketTimeout by the leader no other updates are processed:

ERROR ON REPLICA:

Jul 8, 2013 6:38:16 PM org.apache.solr.core.SolrCore execute
INFO: [collection2_0] webapp=/solr path=/update 
params={distrib.from=http://Solr4-1-1.domain.com:8983/solr/collection2_0/&update.distrib=FROMLEADER&wt=javabin&version=2}
 status=0 QTime=50012

Jul 8, 2013 6:38:16 PM org.apache.solr.common.SolrException log
SEVERE: null:org.eclipse.jetty.io.EofException
at org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:154)
at org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:101)
at 
org.apache.solr.common.util.FastOutputStream.flush(FastOutputStream.java:203)
at 
org.apache.solr.common.util.FastOutputStream.flushBuffer(FastOutputStream.java:196)
at 
org.apache.solr.common.util.JavaBinCodec.marshal(JavaBinCodec.java:94)
at 
org.apache.solr.response.BinaryResponseWriter.write(BinaryResponseWriter.java:49)
at 
org.apache.solr.servlet.SolrDispatchFilter.writeResponse(SolrDispatchFilter.java:404)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:289)
at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1337)
at 
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:484)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:119)
at 
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:524)
at 
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:233)
at 
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1065)
at 
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:413)
at 
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:192)
at 
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:999)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:117)
at 
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:250)
at 
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:149)
at 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:111)
at org.eclipse.jetty.server.Server.handle(Server.java:351)
at 
org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:454)
at 
org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:47)
at 

AW: Solr Hangs During Updates for over 10 minutes

2013-07-09 Thread Jed Glazner
Hi Shawn,

I have been trying to duplicate this problem without success for the last 2 
weeks, which is one reason I'm getting flustered.   It seems reasonable to be 
able to duplicate it, but I can't.

 We do have a story to upgrade but that is still weeks if not months before 
that gets rolled out to production.

We have another cluster running the same version, but with 8 shards and 8 
replicas, each shard at 100GB, with more load and more indexing requests, 
without this problem; but there we send docs in batches and all fields are 
stored, whereas the trouble index has only 1 or 2 stored fields and we only 
send docs 1 at a time.

Could that have anything to do with it?

Jed


Sent from Samsung Mobile



 Original message 
From: Shawn Heisey s...@elyograg.org
Date: 07.09.2013 18:33 (GMT+01:00)
To: solr-user@lucene.apache.org
Subject: Re: Solr Hangs During Updates for over 10 minutes


On 7/9/2013 9:50 AM, Jed Glazner wrote:
 I'll give you the high level before delving deep into setup etc. I have been 
 struggling at work with a seemingly random problem where solr will hang for 
 10-15 minutes during updates.  This outage always seems to be immediately 
 preceded by an EOF exception on the replica.  Then 10-15 minutes later we 
 see an exception on the leader for a socket timeout to the replica.  The 
 leader will then tell the replica to recover, which in most cases it does, and 
 then the outage is over.

 Here are the setup details:

 We are currently using Solr 4.0.0 with an external ZK ensemble of 5 machines.

After 4.0.0 was released, a *lot* of problems with SolrCloud surfaced
and have since been fixed.  You're five releases and about nine months
behind what's current.  My recommendation: Upgrade to 4.3.1, ensure your
configuration is up to date with changes to the example config between
4.0.0 and 4.3.1, and reindex.  Ideally, you should set up a 4.0.0
testbed, duplicate your current problem, and upgrade the testbed to see
if the problem goes away.  A testbed will also give you practice for a
smooth upgrade of your production system.

Thanks,
Shawn



Re: Writing new indexes from index readers slow!

2013-03-22 Thread Jed Glazner
Thanks Otis,

I had not considered that approach; however, not all of our fields are stored, 
so that's not going to work for me.

I'm wondering if it's slow because there is just the one reader getting passed 
to the index writer... I noticed today that the addIndexes method can take an 
array of readers.  Maybe I can send in an array of readers for the 
individual segments in the index...

I'll try that tomorrow.

Jed

Otis Gospodnetic otis.gospodne...@gmail.com wrote:



Jed,

While this is something completely different, have you considered using 
SolrEntityProcessor instead? (assuming all your fields are stored)
http://wiki.apache.org/solr/DataImportHandler#SolrEntityProcessor

Otis
--
Solr & ElasticSearch Support
http://sematext.com/










Writing new indexes from index readers slow!

2013-03-21 Thread Jed Glazner

  
  
Hey Hey Everybody!

I'm not sure if I should have posted this to the developers list...
if i'm totally barking up the wrong tree here, please let me know!

Anywho, I've developed a command line utility based on the
MultiPassIndexSplitter class from the lucene library, but I'm
finding that on our large index (350GB), it's taking WAY too long to
write the newly split indexes! It took 20.5 hours for execution to
finish. I should note that solr is not running while I'm splitting
the index. Because solr can't really be running while I run this
tool, performance is critical, as our service will be down! 

I am aware that there is an api currently under development on trunk
in solr cloud (https://issues.apache.org/jira/browse/SOLR-3755) but
I need something now, as our large index is wreaking havoc on our
service.

Here is some basic context info:

The Index:
==
Solr/Lucene 4.1
Index Size: 350GB
Documents: 185,194,528

The Hardware (http://aws.amazon.com/ec2/instance-types/):
===
AWS High-Memory X-Large (m2.xlarge) instance
CPU: 8 cores (2 virtual cores with 3.25 EC2 Compute Units each)
17.1 GB ram
1.2TB ebs raid

The Process (splitting 1 index into 8):
===
I'm trying to split this index into 8 separate indexes using this
tool. To do this I create 8 worker threads. Each thread creates
gets a new FakeDeleteIndexReader object, and loops over every
document, and uses a hash algorithm to decide if it should keep or
delete the document. Note that the documents are not actually
deleted at this point because (as I understand it) the
FakeDeleteIndexReader emulates deletes without actually modifying
the underlying index.
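As a rough sketch of the keep/delete decision described above (the class and method names here are hypothetical, not from the actual tool), a stable hash over the document id routes each doc to exactly one of the 8 partitions:

```java
// Hypothetical sketch of hash-based document routing: worker k keeps only
// the documents whose id hashes into partition k, so each doc survives in
// exactly one of the split indexes.
public class HashPartitioner {
    private final int numPartitions;

    public HashPartitioner(int numPartitions) {
        this.numPartitions = numPartitions;
    }

    /** Partition (0..numPartitions-1) a document id belongs to. */
    public int partitionOf(String docId) {
        // Mask the sign bit so negative hashCodes still map to a valid bucket.
        return (docId.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    /** Worker k keeps the doc only when it falls in its own partition. */
    public boolean shouldKeep(String docId, int workerIndex) {
        return partitionOf(docId) == workerIndex;
    }
}
```

Any deterministic function of a unique field works; what matters is that every worker applies the same function, so the resulting partitions are disjoint and together cover the whole index.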

After each worker has determined which documents it should keep, I
create a new Directory object, instantiate a new IndexWriter, and
pass the FakeDeleteIndexReader object to the addIndexes method. (This
is the part that takes forever!)

It only takes about an hour for all of the threads to hash/delete
the documents they don't want. However, it takes 19+ hours to write
all of the new indexes! Watching iowait, the disk doesn't look to
be overworked (about 85% idle), so I'm baffled as to why it would
take that long! I've tried running the write operations inside the
worker threads, and serially, with no real difference!

Here is the relevant code that I'm using to write the indexes:

/**
 * Creates/merges a new index with a FakeDeleteIndexReader. The reader
 * should have marked/deleted all of the documents that should not be
 * included in this new index. When the index is written/committed
 * these documents will be removed.
 *
 * @param directory
 *            The directory object of the new index
 * @param version
 *            The lucene version of the index
 * @param reader
 *            A FakeDeleteIndexReader that contains lots of
 *            uncommitted deletes.
 * @throws IOException
 */
private void writeToDisk(Directory directory, Version version,
        FakeDeleteIndexReader reader) throws IOException
{
    IndexWriterConfig cfg = new IndexWriterConfig(version,
            new WhitespaceAnalyzer(version));
    cfg.setOpenMode(OpenMode.CREATE);

    IndexWriter w = new IndexWriter(directory, cfg);
    w.addIndexes(reader);
    w.commit();
    w.close();
    reader.close();
}
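For what it's worth, the eight per-partition writes could be dispatched from a thread pool along these lines; this is only a scheduling sketch, with the Lucene write replaced by a counter, since the real writeToDisk call needs the Directory and reader objects above:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class ParallelSplitSketch {
    static final int NUM_PARTITIONS = 8;

    /** Runs one task per partition; returns how many completed. */
    static int runSplit() throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(NUM_PARTITIONS);
        CountDownLatch done = new CountDownLatch(NUM_PARTITIONS);
        AtomicInteger written = new AtomicInteger();

        for (int i = 0; i < NUM_PARTITIONS; i++) {
            pool.submit(() -> {
                try {
                    // Stand-in for the per-partition Lucene write, i.e.
                    // writeToDisk(directory, version, reader) in the post.
                    written.incrementAndGet();
                } finally {
                    done.countDown();
                }
            });
        }

        done.await(10, TimeUnit.SECONDS);
        pool.shutdown();
        return written.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("partitions written: " + runSplit());
    }
}
```

Given the ~85% idle disk reported above, if the writes really are independent, fanning them out like this is the obvious thing to test before digging deeper.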

Any Ideas?? I'm happy to share more snippets of source code if that
is helpful..
-- 
Jed Glazner
Sr. Software Engineer
Adobe Social

385.221.1072 (tel)
801.360.0181 (cell)
jglaz...@adobe.com

550 East Timpanogus Circle
Orem, UT 84097-6215, USA
www.adobe.com


Re: How to make a server become a replica / leader for a collection at startup

2012-08-19 Thread Jed Glazner
Hey Mark,

Thanks for the extra effort in responding :)

Are you ok if I file a jira ticket and complete this feature on trunk?  We need 
this feature for a project.

Jed Glazner
Sr. Software Engineer
Adobe
jglaz...@adobe.com

- Reply message -
From: Mark Miller markrmil...@gmail.com
To: solr-user@lucene.apache.org solr-user@lucene.apache.org
Subject: How to make a server become a replica / leader for a collection at 
startup
Date: Sun, Aug 19, 2012 9:11 am



Hmm...last email was blocked from the list as spam :)

Let me try again forcing plain text:


Hey Jed,

I think what you are looking for is something I have proposed, but is not
implemented yet. We started with a fairly simple collections API since we
just wanted to make sure we had something in 4.0.

I would like it to be better though. My proposal was that when you create a
new collection with n shards and z replicas, that should be recorded in
ZooKeeper by the Overseer. The Overseer should then watch for when a new
node comes up, then trigger a process that compares the config for the
collection against the real world, and remove or add replicas based on that
info.

I don't think it's that difficult to do, but given a lot of other things we
are working on, and the worry of destabilizing anything before the 4
release, I think it's more likely to come in a point release later. It's
not super complicated work, but there are some tricky corner cases I think.

- Mark
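The compare-config-against-reality step Mark describes could be sketched roughly like this; ReplicaReconciler, its method, and its inputs are hypothetical names, purely to illustrate the diff the Overseer would compute:

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of the proposed Overseer reconciliation step:
 *  compare the desired replica count per shard (recorded in ZooKeeper at
 *  collection creation) against the replicas actually live right now. */
public class ReplicaReconciler {

    /** Per shard: positive = replicas to add, negative = replicas to remove.
     *  Shards already at their desired count are omitted. */
    public static Map<String, Integer> diff(Map<String, Integer> desired,
                                            Map<String, Integer> live) {
        Map<String, Integer> actions = new HashMap<>();
        for (Map.Entry<String, Integer> e : desired.entrySet()) {
            int have = live.getOrDefault(e.getKey(), 0);
            int delta = e.getValue() - have;
            if (delta != 0) {
                actions.put(e.getKey(), delta);
            }
        }
        return actions;
    }
}
```

When a node rejoins, any shard with a positive delta would be a candidate for the returning node to host, which is the behavior the original post asks about.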


How to make a server become a replica / leader for a collection at startup

2012-08-17 Thread Jed Glazner

  
  
Hello All,

I'm working to solve an interesting problem. The problem that I
have is that when I pull a server out of the cloud (to do
maintenance, say) and then bring it back up, it won't automatically
sync up with zookeeper and become a leader or replica for any
collections that I created while it was off-line, even though I
specified a number of shards or replicas higher than the number of
servers that are registered with zookeeper.

Here is my setup:
External ZooKeeper (v3.3.5) ensemble (zk1, zk2, zk3)
SolrCloud (4.0.0-BETA) with 2 shards and 2 replicas (shard1, shard2,
shard1a, shard2a)

Here is the detailed scenario:
I create a new collection named 'collection2' using the collections
API and specify 2 shards and 2 replicas (curl
'http://shard1:8983/solr/admin/collections?action=CREATE&name=collection2&numShards=2&replicationFactor=2').
The result of the call creates (as I would expect) 2 shards and 2
replicas.

I then push some docs into 'collection2' and I see the documents are
distributed between shard1 and shard2 and are replicated to 1a and
2a. So far so good.

Now, to simulate a node failure, I take down shard1a while pushing
some more docs into 'collection2'. Additionally, while shard1a is
down, I also create a new collection named 'collection3' using the
collections API and specify 2 shards and 2 replicas. The result of
the call creates (as I would expect) 2 shards and only 1 replica:
since shard1a is down, there are not enough servers to create all of
the replicas.

Before bringing shard1a back up, I push some documents into
'collection3' and see the docs are distributed between shard1 and
shard2, with shard2a replicating shard2. Everything looks great and
works as expected, thus far.

When I bring shard1a back on-line however, here is what I would expect
to happen:
1. Shard1a registers with zookeeper, zookeeper assigns it as a
replica of shard1 for 'collection2' (it knows about collection2
because it's stored in the solr.xml)
2. Shard1a asks zookeeper if there are any collections that have
missing replicas, or not enough shards.
3. Zookeeper responds that 'collection3' on shard1 doesn't have a
replica (remember I created the collection with 2 replicas but only
one is present).
4. Shard1a creates a new core and becomes a replica for
'collection3' on shard1
5. Shard1a synchronizes with shard1 and replicates the missing
documents for 'collection2' and 'collection3'.

However here is what really happens:
1. shard1a registers with zookeeper and is assigned a replica of
shard1 for 'collection2'
2. shard1a synchronizes with shard1 and replicates the missing
documents for 'collection2'
Nothing else happens.

How can I make shard1a automatically become a replica or a leader
for missing cores within a collection when it comes online?

Jed Glazner
Sr. Software Engineer
Adobe Social
385.221.1072 (tel)
801.360.0181 (cell)
jglaz...@adobe.com
550 East Timpanogus Circle
Orem, UT 84097-6215, USA
www.adobe.com


Re: Replication Fails with Unreachable error when master host is responding.

2011-05-03 Thread Jed Glazner
So it turns out that it's the host names.  According to the DNS RFC,
underscores are not valid in host names. Most DNS servers now support
them, but strictly speaking it's not in the RFC.  So there must be
something in the underlying Java classes that borks when using
underscores in host names, though I didn't see anything in the stack
trace that indicated an invalid host name exception. That was the
issue, though: once I changed the host name to the master's IP address,
replication worked great.  So I'm working with our IT to remove
underscores from our host names.
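A quick way to catch this class of problem up front is to validate host names against the RFC rules before wiring them into configs. The sketch below is an illustration (not part of Solr); it applies the RFC 952/1123 label rules, under which underscores are invalid:

```python
import re

# RFC 952/1123: each dot-separated label may contain only letters,
# digits, and hyphens, and may not start or end with a hyphen.
# Underscores are not allowed, which is what tripped up the HTTP client.
LABEL = re.compile(r"^(?!-)[A-Za-z0-9-]{1,63}(?<!-)$")

def is_valid_hostname(host):
    """Return True if every label of `host` is RFC-1123 valid."""
    if not host or len(host) > 253:
        return False
    return all(LABEL.match(label) for label in host.rstrip(".").split("."))

print(is_valid_hostname("solr-master-01-dev.la.bo"))  # True
print(is_valid_hostname("solr-master-01_dev.la.bo"))  # False: underscore
```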


Just thought I would post my answer here in case anyone else had that 
issue.


Thanks.

Jed.

On 04/28/2011 02:03 PM, Mike Sokolov wrote:

No clue. Try wireshark to gather more data?

On 04/28/2011 02:53 PM, Jed Glazner wrote:

Anybody?


Re: Replication Fails with Unreachable error when master host is responding.

2011-04-28 Thread Jed Glazner


  
  
Anybody?


Replication Fails with Unreachable error when master host is responding.

2011-04-27 Thread Jed Glazner

Hello All,

I'm having a very strange problem that I just can't figure out. The 
slave is not able to replicate from the master, even though the master 
is reachable from the slave machine.  I can telnet to the port it's 
running on, I can use text based browsers to navigate the master from 
the slave. I just don't understand why it won't replicate.  The admin 
screen gives me an Unreachable in the status, and in the log there is an 
exception thrown.  Details below:


BACKGROUND:

OS: Arch Linux
Solr Version: svn revision 1096983 from 
https://svn.apache.org/repos/asf/lucene/dev/branches/branch_3x/

No custom plugins, just whatever came with the version above.
Java Setup:

java version "1.6.0_22"
OpenJDK Runtime Environment (IcedTea6 1.10) (ArchLinux-6.b22_1.10-1-x86_64)
OpenJDK 64-Bit Server VM (build 19.0-b09, mixed mode)

We have 3 cores running, all 3 cores are not able to replicate.

The admin on the slave shows the master as
http://solr-master-01_dev.la.bo:8983/solr/music/replication - *Unreachable*

Replication config on the slave:

<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="${slave:slave}">
    <str name="masterUrl">http://solr-master-01_dev.la.bo:8983/solr/music/replication</str>
    <str name="pollInterval">00:15:00</str>
  </lst>
</requestHandler>

Replication config on the master:

<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="${master:master}">
    <str name="replicateAfter">commit</str>
    <str name="replicateAfter">startup</str>
    <str name="confFiles">schema.xml,stopwords.txt</str>
  </lst>
</requestHandler>

Below is the log start to finish for replication attempts, note that it 
says connection refused, however, I can telnet to 8983 from the slave to 
the master, so I know it's up and reachable from the slave:


telnet solr-master-01_dev.la.bo 8983
Trying 172.12.65.58...
Connected to solr-master-01_dev.la.bo.
Escape character is '^]'.
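For what it's worth, the telnet check can be scripted; this is a small sketch (not from the original thread). Note that a raw TCP connect succeeding only proves the port is open from this box - it says nothing about how the HTTP client library resolves or validates the host name, which is what ultimately mattered here:

```python
import socket

def can_connect(host, port, timeout=3.0):
    """Plain TCP connect check, equivalent to the telnet test above."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# e.g. can_connect("solr-master-01_dev.la.bo", 8983)
```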

I double-checked the master to make sure that it didn't have replication
turned off, and it doesn't.  So I should be able to replicate, but I
can't.  I just don't know what else to check.  The log from the slave is
below.


Apr 27, 2011 7:39:45 PM org.apache.solr.request.SolrQueryResponse init
WARNING: org.apache.solr.request.SolrQueryResponse is deprecated. Please 
use the corresponding class in org.apache.solr.response
Apr 27, 2011 7:39:45 PM org.apache.commons.httpclient.HttpMethodDirector 
executeWithRetry
INFO: I/O exception (java.net.ConnectException) caught when processing 
request: Connection refused
Apr 27, 2011 7:39:45 PM org.apache.commons.httpclient.HttpMethodDirector 
executeWithRetry

INFO: Retrying request
Apr 27, 2011 7:39:45 PM org.apache.commons.httpclient.HttpMethodDirector 
executeWithRetry
INFO: I/O exception (java.net.ConnectException) caught when processing 
request: Connection refused
Apr 27, 2011 7:39:45 PM org.apache.commons.httpclient.HttpMethodDirector 
executeWithRetry

INFO: Retrying request
Apr 27, 2011 7:39:45 PM org.apache.commons.httpclient.HttpMethodDirector 
executeWithRetry
INFO: I/O exception (java.net.ConnectException) caught when processing 
request: Connection refused
Apr 27, 2011 7:39:45 PM org.apache.commons.httpclient.HttpMethodDirector 
executeWithRetry

INFO: Retrying request
Apr 27, 2011 7:39:45 PM org.apache.solr.handler.ReplicationHandler 
getReplicationDetails
WARNING: Exception while invoking 'details' method for replication on 
master

java.net.ConnectException: Connection refused
at java.net.PlainSocketImpl.socketConnect(Native Method)
at 
java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:327)
at 
java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:193)
at 
java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:180)

at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:384)
at java.net.Socket.connect(Socket.java:546)
at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

at java.lang.reflect.Method.invoke(Method.java:616)
at 
org.apache.commons.httpclient.protocol.ReflectionSocketFactory.createSocket(ReflectionSocketFactory.java:140)
at 
org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:125)
at 
org.apache.commons.httpclient.HttpConnection.open(HttpConnection.java:707)
at 
org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$HttpConnectionAdapter.open(MultiThreadedHttpConnectionManager.java:1361)
at 
org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(HttpMethodDirector.java:387)
at 
org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:171)
at 
org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
at 

Re: HTTP ERROR 400 undefined field: *

2011-02-08 Thread Jed Glazner
So I re-indexed some of the content, but no dice. Per Hoss, I tried
disabling the TVC and it worked great.  We're not really using the TVC
right now, since we decided to turn off highlighting for the moment, so
this isn't a huge deal.  I'll create a new JIRA issue.


FYI here is my query from the logs:

-- this one breaks (undefined field):
webapp=/solr path=/select
params={explainOther=&fl=*,score&indent=on&start=0&q=bruce&hl.fl=&qt=standard&wt=standard&fq=&version=2.2&rows=10}
hits=114 status=400 QTime=21


this one works:
webapp=/solr path=/select
params={explainOther=&indent=on&hl.fl=&wt=standard&version=2.2&rows=10&fl=*,score&start=0&q=bruce&tv=false&qt=standard&fq=}
hits=128 status=0 QTime=48


Though I'm not sure why, when the TVC is disabled, there are more hits
but the QTime is slower.  That's a different issue though, and something
I can work through.


Thanks for your help.



On 02/07/2011 11:38 AM, Chris Hostetter wrote:

: The stack trace is attached.  I also saw this warning in the logs not sure

 From your attachment...

  853 SEVERE: org.apache.solr.common.SolrException: undefined field: score
  854   at 
org.apache.solr.handler.component.TermVectorComponent.process(TermVectorComponent.java:142)
  855   at 
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:194)
  856   at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
  857   at org.apache.solr.core.SolrCore.execute(SolrCore.java:1357)

...this is one of the key pieces of info that was missing from your
earlier email: that you are using the TermVectorComponent.

It's likely that something changed in the TVC on 3x between the two
versions you were using, and that change now freaks out on * or score
in the fl.

you still haven't given us an example of the full URLs you are using that
trigger this error. (it's possible there is something slightly off in your
syntax - we don't know because you haven't shown us)

All in: this sounds like a newly introduced bug in TVC, please post the
details into a new Jira issue.

as to the warning you asked about...

: Feb 3, 2011 8:14:10 PM org.apache.solr.core.Config getLuceneVersion
: WARNING: the luceneMatchVersion is not specified, defaulting to LUCENE_24
: emulation. You should at some point declare and reindex to at least 3.0,
: because 2.4 emulation is deprecated and will be removed in 4.0. This parameter
: will be mandatory in 4.0.

if you look at the example configs on the 3x branch it should be
explained.  it's basically just a new feature that lets you specify
which quirks of the underlying lucene code you want (so on upgrading you
are in control of whether you eliminate old quirks or not)


-Hoss
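For reference, the setting Hoss mentions is a top-level element in solrconfig.xml; on the 3.x branch it would look something like the fragment below (the exact version value depends on what you have actually reindexed for):

```xml
<!-- Controls which version of Lucene's behavior Solr emulates.
     After reindexing on the 3.x branch, declare at least 3.0: -->
<luceneMatchVersion>LUCENE_30</luceneMatchVersion>
```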




Re: HTTP ERROR 400 undefined field: *

2011-02-08 Thread Jed Glazner

here is the ticket:
https://issues.apache.org/jira/browse/SOLR-2352





Re: HTTP ERROR 400 undefined field: *

2011-02-07 Thread Jed Glazner

Thanks Otis,

I'll give that a try.

Jed.

On 02/06/2011 08:06 PM, Otis Gospodnetic wrote:

Yup, here it is, warning about needing to reindex:

http://twitter.com/#!/lucene/status/28694113180192768

Otis

Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/



- Original Message 

From: Erick Ericksonerickerick...@gmail.com
To: solr-user@lucene.apache.org
Sent: Sun, February 6, 2011 9:43:00 AM
Subject: Re: HTTP ERROR 400 undefined field: *

I *think* that there was a post a while ago saying that if you were
using trunk 3_x, one of the recent changes required re-indexing, but don't
quote me on that.
Have you tried that?

Best
Erick




Re: HTTP ERROR 400 undefined field: *

2011-02-04 Thread Jed Glazner

Sorry for the lack of details.

It's all clear in my head.. :)

We checked out the head revision from the 3.x branch a few weeks ago 
(https://svn.apache.org/repos/asf/lucene/dev/branches/branch_3x/). We 
picked up r1058326.


We upgraded from a previous checkout (r960098). I am using our 
customized schema.xml and the solrconfig.xml from the old revision with 
the new checkout.


After upgrading I just copied the data folders from each core into the 
new checkout (hoping I wouldn't have to re-index the content, as this 
takes days).  Everything seems to work fine, except that now I can't get 
the score to return.


The stack trace is attached.  I also saw this warning in the logs not 
sure exactly what it's talking about:


Feb 3, 2011 8:14:10 PM org.apache.solr.core.Config getLuceneVersion
WARNING: the luceneMatchVersion is not specified, defaulting to 
LUCENE_24 emulation. You should at some point declare and reindex to at 
least 3.0, because 2.4 emulation is deprecated and will be removed in 
4.0. This parameter will be mandatory in 4.0.


Here is my request handler. The actual fields here are different than
what is in mine, but I'm a little uncomfortable publishing how our
company's search service works to the world:


<requestHandler name="standard" class="solr.SearchHandler" default="true">
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <str name="defType">edismax</str>
    <bool name="tv">true</bool>
    <!-- standard fields to query on -->
    <str name="qf">field_a^2 field_b^2 field_c^4</str>

    <!-- automatic phrase boosting! -->
    <str name="pf">field_d^10</str>

    <!-- boost function -->
    <!--
      We'll comment this out for now because we're passing it
      to solr as a parameter. Once we finalize the exact function
      we should move it here and take it out of the query string.
    -->
    <!-- <str name="bf">log(linear(field_e,0.001,1))^10</str> -->
    <str name="tie">0.1</str>
  </lst>
  <arr name="last-components">
    <str>tvComponent</str>
  </arr>
</requestHandler>

Anyway  Hopefully this is enough info, let me know if you need more.

Jed.





On 02/03/2011 10:29 PM, Chris Hostetter wrote:

: I was working on a checkout of the 3.x branch from about 6 months ago.
: Everything was working pretty well, but we decided that we should update and
: get what was at the head.  However after upgrading, I am now getting this

FWIW: please be specific.  head of what? the 3x branch? or trunk?  what
revision in svn does that correspond to? (the svnversion command will
tell you)

: HTTP ERROR 400 undefined field: *
:
: If I clear the fl parameter (default is set to *, score) then it works fine
: with one big problem, no score data.  If I try and set fl=score I get the same
: error except it says undefined field: score?!
:
: This works great in the older version, what changed?  I've googled for about
: an hour now and I can't seem to find anything.

i can't reproduce this using either trunk (r1067044) or 3x (r1067045)

all of these queries work just fine...

http://localhost:8983/solr/select/?q=*
http://localhost:8983/solr/select/?q=solr&fl=*,score
http://localhost:8983/solr/select/?q=solr&fl=score
http://localhost:8983/solr/select/?q=solr

...you'll have to provide us with a *lot* more details to help understand
why you might be getting an error (like: what your configs look like, what
the request looks like, what the full stack trace of your error is in the
logs, etc...)




-Hoss


 844 Feb 3, 2011 8:16:58 PM org.apache.solr.core.SolrCore execute
 845 INFO: [music] webapp=/solr path=/select params={explainOther=&fl=*,score&indent=on&start=0&q=test&hl.fl=&qt=standard&wt=standard&fq=&version=2.2&rows=10} hits=2201 status=400 QTime=143
 846 Feb 3, 2011 8:17:00 PM org.apache.solr.core.SolrCore execute
 847 INFO: [rovi] webapp=/solr path=/replication params={command=indexversion&wt=javabin} status=0 QTime=0
 848 Feb 3, 2011 8:17:00 PM org.apache.solr.core.SolrCore execute
 849 INFO: [rovi] webapp=/solr path=/replication params={command=filelist&wt=javabin&indexversion=1277332208072} status=0 QTime=0
 850 Feb 3, 2011 8:17:00 PM org.apache.solr.core.SolrCore execute
 851 INFO: [rovi] webapp=/solr path=/replication params={command=indexversion&wt=javabin} status=0 QTime=0
 852 Feb 3, 2011 8:17:09 PM org.apache.solr.common.SolrException log
 853 SEVERE: org.apache.solr.common.SolrException: undefined field: score
 854   at org.apache.solr.handler.component.TermVectorComponent.process(TermVectorComponent.java:142)
 855   at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:194)
 856   at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
 857   at org.apache.solr.core.SolrCore.execute(SolrCore.java:1357)
 858   at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:341)
 859   at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:244)
 860   at 

HTTP ERROR 400 undefined field: *

2011-02-03 Thread Jed Glazner

Hey Guys,

I was working on a checkout of the 3.x branch from about 6 months ago. 
Everything was working pretty well, but we decided to update to the head 
of the branch.  However, after upgrading I am now getting this error 
through the admin:


HTTP ERROR 400 undefined field: *

If I clear the fl parameter (the default is set to *,score) then it works 
fine, with one big problem: no score data.  If I try to set fl=score I 
get the same error, except it says undefined field: score?!


This works great in the older version; what changed?  I've googled for 
about an hour now and can't seem to find anything.



Jed.
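For context, an fl default like the one described above usually lives in the request handler's defaults in solrconfig.xml; a minimal sketch of what such a section might look like (the handler name and other values here are illustrative assumptions, not taken from the actual config):

```xml
<requestHandler name="standard" class="solr.SearchHandler" default="true">
  <lst name="defaults">
    <!-- Illustrative default: return all stored fields plus the score. -->
    <str name="fl">*,score</str>
  </lst>
</requestHandler>
```

If the handler pipeline also runs components that iterate over fl (as the TermVectorComponent stack trace earlier in the thread suggests), pseudo-fields like * and score can trip an undefined-field check.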


Solr Highlighting Question

2010-09-08 Thread Jed Glazner

Thanks for taking time to read through this.  I'm using a checkout from
the solr 3.x branch.

My problem is with the highlighter and wildcards

I can get the highlighter to work with wildcards just fine; the problem
is that Solr highlights the entire matched term, when what I want is for
it to highlight only the characters in the term that were matched.


Example:

http://192.168.1.75:8983/solr/music/select?indent=on&q=name_title:wel*&qt=beyond&hl=true&hl.fl=name_title&f.name_title.hl.usePhraseHighlighter=true&f.name_title.hl.highlightMultiTerm=true

The results that come back look like this:

<em>Welcome</em> to the Jungle

What I want them to look like is this:
<em>Wel</em>come to the Jungle

From what I gathered by searching the archives, Solr 1.1 used to
do this... Is there a way to get that functionality?

Thanks!
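Absent built-in support for this, one workaround is post-processing the highlight snippets client-side so only the characters matched by the wildcard prefix stay emphasized; a hedged sketch (this is not a Solr feature, purely a client-side fix-up over the returned snippet):

```python
import re

def narrow_highlight(snippet, prefix):
    """Shrink <em>Welcome</em> to <em>Wel</em>come when the query was 'wel*'.

    Assumes Solr wrapped whole matched terms in <em> tags; this is a
    client-side workaround, not a Solr configuration option.
    """
    def repl(m):
        term = m.group(1)
        # Only narrow when the term extends beyond the queried prefix.
        if term.lower().startswith(prefix.lower()) and len(term) > len(prefix):
            return "<em>%s</em>%s" % (term[:len(prefix)], term[len(prefix):])
        return m.group(0)
    return re.sub(r"<em>(.*?)</em>", repl, snippet)

print(narrow_highlight("<em>Welcome</em> to the Jungle", "wel"))
# → <em>Wel</em>come to the Jungle
```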



Re: Solr Highlighting Question

2010-09-08 Thread Jed Glazner




Anybody?

On 09/08/2010 11:26 AM, Jed Glazner wrote:

  Thanks for taking time to read through this.  I'm using a checkout from
the solr 3.x branch.

My problem is with the highlighter and wildcards

I can get the highlighter to work with wildcards just fine; the problem
is that Solr highlights the entire matched term, when what I want is for
it to highlight only the characters in the term that were matched.


Example:

http://192.168.1.75:8983/solr/music/select?indent=on&q=name_title:wel*&qt=beyond&hl=true&hl.fl=name_title&f.name_title.hl.usePhraseHighlighter=true&f.name_title.hl.highlightMultiTerm=true

The results that come back look like this:

<em>Welcome</em> to the Jungle

What I want them to look like is this:
<em>Wel</em>come to the Jungle

From what I gathered by searching the archives, Solr 1.1 used to
do this... Is there a way to get that functionality?

Thanks!

  









Help with partial term highlighting

2010-09-07 Thread Jed Glazner

Hello Everyone,

Thanks for taking time to read through this.  I'm using a checkout from
the solr 3.x branch

My problem is with the highlighter and wildcards, and is exactly the
same as this guy's but I can't find a reply to his problem:

http://search-lucene.com/m/EARFMs6eR4/partial+highlight+wildcard&subj=Re+old+wildcard+highlighting+behaviour

I can get the highlighter to work with wildcards just fine; the problem
is that Solr highlights the entire matched term, when what I want is for
it to highlight only the characters in the term that were matched.

Example:

http://192.168.1.75:8983/solr/music/select?indent=on&q=name_title:wel*&qt=beyond&hl=true&hl.fl=name_title&f.name_title.hl.usePhraseHighlighter=true&f.name_title.hl.highlightMultiTerm=true

The results that come back look like this:

<em>Welcome</em> to the Jungle

What I want them to look like is this:
<em>Wel</em>come to the Jungle

From what I gathered by searching the archives, Solr 1.1 used to
do this... Is there any way to get what I want without customizing the
highlighting feature?

Thanks!