Re: Indexing problems with BBoxField

2014-11-24 Thread remus
OK, David Smiley has already created an issue for this:

https://issues.apache.org/jira/browse/SOLR-6781

So it really is a bug.
Furthermore, I also had a lot of problems trying to search on the field
after finally getting it indexed. I summarized those here:

https://issues.apache.org/jira/browse/SOLR-6784



Re: Indexing problems with BBoxField

2014-11-24 Thread david.w.smi...@gmail.com
Thomas,

Thank you for communicating on the list about your experience and raising
the JIRA issue.  I meant to respond last night but lost the chance.  (And
Jack, thanks for helping Thomas out.)  I’ll follow up on SOLR-6784.
SOLR-6781 now has a bug-fix patch. I’ll apply it later today.

~ David Smiley
Freelance Apache Lucene/Solr Search Consultant/Developer
http://www.linkedin.com/in/davidwsmiley



Re: Indexing problems with BBoxField

2014-11-23 Thread Jack Krupansky
A difference I see between your snippet and the example is that you don't have 
docValues="true" on the coordinate field type. You wrote:


<fieldType name="_bbox_coord" class="solr.TrieDoubleField" precisionStep="8" 
stored="false" />


But the example is:

<fieldType name="_bbox_coord" class="solr.TrieDoubleField" precisionStep="8" 
docValues="true" stored="false"/>


Also, maybe try a static field rather than dynamic field, although the 
latter should work anyway.


Please file a Jira to request that Solr give a user-sensible error, not a 
Lucene-level error. I mean, the Solr user has no ability to directly invoke 
the createFields method.


And now... let's see what David Smiley has to say about all of this!

-- Jack Krupansky

-----Original Message-----
From: Thomas Seidl
Sent: Sunday, November 23, 2014 6:33 AM
To: solr-user@lucene.apache.org
Subject: Indexing problems with BBoxField

Hi all,

I just downloaded Solr 4.10.2 and wanted to try out the new BBoxField
type, but couldn't get it to work. The error (with status 400) I get is:

ERROR: [doc=foo] Error adding field
'bboxs_field_location_area'='ENVELOPE(25.89, 41.13, 47.07, 35.31)'
msg=java.lang.IllegalStateException: instead call createFields() because
isPolyField() is true

Which, of course, is rather unhelpful for a user.
The relevant portions of my schema.xml look like this (largely copied
from [1]):

<fieldType name="bbox" class="solr.BBoxField" geo="true" units="degrees"
numberType="_bbox_coord" />
<fieldType name="_bbox_coord" class="solr.TrieDoubleField"
precisionStep="8" stored="false" />
<dynamicField name="bboxs_*" type="bbox" indexed="true" stored="false"
multiValued="false"/>

[1] https://cwiki.apache.org/confluence/display/solr/Spatial+Search

And the request I send is this:

<add>
 <doc>
   <field name="id">foo</field>
   <field name="bboxs_field_location_area">ENVELOPE(25.89, 41.13,
47.07, 35.31)</field>
 </doc>
</add>

Does anyone have any idea what could be going wrong here?

Thanks a lot in advance,
Thomas 
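
For readers unfamiliar with the value in the request above: the ENVELOPE
syntax lists its numbers in the order minX, maxX, maxY, minY. Below is a
minimal Java sketch that pulls those four values apart, assuming plain
decimal numbers. The EnvelopeParser class is hypothetical and is not part of
Solr or its spatial libraries; it only illustrates the ordering.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not a Solr class): splits an ENVELOPE string into its numbers. */
final class EnvelopeParser {
    // One decimal number, optionally signed, with optional surrounding whitespace.
    private static final String NUM = "\\s*(-?\\d+(?:\\.\\d+)?)\\s*";
    private static final Pattern ENVELOPE = Pattern.compile(
            "ENVELOPE\\(" + NUM + "," + NUM + "," + NUM + "," + NUM + "\\)");

    /** Returns {minX, maxX, maxY, minY}, i.e. the order used by the ENVELOPE syntax. */
    static double[] parse(String text) {
        Matcher m = ENVELOPE.matcher(text.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Not an ENVELOPE value: " + text);
        }
        double[] v = new double[4];
        for (int i = 0; i < 4; i++) {
            v[i] = Double.parseDouble(m.group(i + 1));
        }
        // minX may exceed maxX for boxes crossing the dateline, so only Y is checked.
        if (v[3] > v[2]) {
            throw new IllegalArgumentException("minY is greater than maxY: " + text);
        }
        return v;
    }

    public static void main(String[] args) {
        double[] v = parse("ENVELOPE(25.89, 41.13, 47.07, 35.31)");
        System.out.println("minX=" + v[0] + " maxX=" + v[1]
                + " maxY=" + v[2] + " minY=" + v[3]);
    }
}
```

The value from the request parses cleanly under this reading, which is
consistent with the problem lying in the schema handling rather than in the
document itself.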



Re: Indexing problems with BBoxField

2014-11-23 Thread remus
Thanks a lot for your reply!

I had »docValues=true« in there before, but then thought I'd try out
removing it to see if that helped. It didn't, and I forgot to re-add it
before copying it into the mail.
So, unfortunately, that's not it.

However, the other one seems to bring us a step closer to the solution:
After adding

<field name="bboxs_field_location_area" type="bbox" indexed="true"
stored="false" multiValued="false"/>

(even without removing the dynamic fields), this works indeed just fine!
So, the question is what causes this, and it seems more and more like a
bug instead of a user error. But I'll wait for a bit more feedback
before filing a Jira.
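
To illustrate why this looks like a bug rather than a user error: the name
bboxs_field_location_area does fall under the bboxs_* dynamic field rule, so
one would expect it to behave like the static definition. Below is a rough
Java sketch of that kind of glob matching, assuming a pattern with a single
leading or trailing *. DynamicFieldPattern is a hypothetical helper and is
not Solr's own matching code.

```java
/** Hypothetical helper (not Solr code): matches names against a one-wildcard pattern. */
final class DynamicFieldPattern {

    /** Supports "prefix*", "*suffix", or an exact name; anything else is out of scope here. */
    static boolean matches(String pattern, String fieldName) {
        if (pattern.startsWith("*")) {
            // "*_s" matches any name ending in "_s"
            return fieldName.endsWith(pattern.substring(1));
        }
        if (pattern.endsWith("*")) {
            // "bboxs_*" matches any name starting with "bboxs_"
            return fieldName.startsWith(pattern.substring(0, pattern.length() - 1));
        }
        return pattern.equals(fieldName);
    }

    public static void main(String[] args) {
        System.out.println(matches("bboxs_*", "bboxs_field_location_area"));
        System.out.println(matches("bboxs_*", "id"));
    }
}
```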



RE: Indexing problems

2013-04-19 Thread GASPARD Joel
Hello

Thank you for your answer.
We have now solved our problem. I'll describe it here in case someone else 
runs into something similar. 

Some of our fields are dynamic, and the name of one of these fields was not 
correct: it was sent to Solr as a Java object, e.g. 
solrInputDocument.addField(myObject, stringValue);

A string representation of this object was displayed in the Solr admin page, 
and that alerted us. We replaced the wrong field name with the string we 
expected, and no more OOMEs occur.
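
A small Java sketch of how such a mistake can be caught early. It assumes
that valid field names contain only letters, digits and underscores, which
is a convention and not something stated in this thread. CheckedDocument is
a hypothetical stand-in and is not the SolrJ SolrInputDocument class.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for a document (not SolrJ): validates field names on add. */
final class CheckedDocument {
    private final Map<String, Object> fields = new LinkedHashMap<>();

    /**
     * Rejects names that look like default Object.toString() output,
     * which has the form ClassName@hexHash and so contains '.' or '@'.
     */
    void addField(String name, Object value) {
        if (name == null || !name.matches("[A-Za-z_][A-Za-z0-9_]*")) {
            throw new IllegalArgumentException("Suspicious field name: " + name);
        }
        fields.put(name, value);
    }

    Map<String, Object> fields() {
        return fields;
    }

    public static void main(String[] args) {
        Object key = new Object(); // a class without a toString() override
        CheckedDocument doc = new CheckedDocument();
        doc.addField("title_s", "ok");
        try {
            // String.valueOf(key) yields the class name, '@', and a hex hash code.
            doc.addField(String.valueOf(key), "oops");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Because the hash part of such a name usually differs from one object to the
next, every document may end up with its own field name, which could be one
way the index ends up with far more fields than intended.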

At least we got to test various Solr configurations.

Regards

Joel Gaspard 





RE: Indexing problems

2013-01-31 Thread GASPARD Joel
Hello,

After more tests, we were able to identify our indexing problem (Solr 4.0.0).
Our problems are in fact OutOfMemoryErrors. Suspecting ZooKeeper connection 
problems was a mistake. We suspected them because OOMEs sometimes 
appear in the logs after errors during ZooKeeper leader election.

Indexing fails when we define several Solr schemas in ZooKeeper.
When we define a single schema, indexing works well. This has been tested with 
a single Solr node in the cluster and with two Solr nodes.
We run into problems when we upload several configurations to ZooKeeper: we 
can create an index for a single collection, but OutOfMemoryErrors are thrown 
when we try to create an index for a second collection with another schema.
Garbage collection logs show a rapid increase in memory consumption, followed 
by OutOfMemoryErrors.

Can we define a distinct schema for each collection?

Thanks !

Joel Gaspard



From: GASPARD Joel [mailto:joel.gasp...@cegedim.com]
Sent: Tuesday, January 22, 2013 16:30
To: solr-user@lucene.apache.org
Subject: Indexing problems

Hello,

We are facing some problems when indexing with Solr 4.0.0 on more than one 
server node, and we can't find a way to solve them.
We have 2 SolrCloud nodes.
They run against a ZooKeeper ensemble (version 3.4.4) with 3 servers 
(another application is deployed on the third server).
We are trying to index a collection with 1 shard stored on the 2 nodes.
2 other collections with a single shard each have already been indexed. The 
logs for that first indexing run have been lost, but there may have been only 
a single Solr node when it was done. Each collection contains about 3,000,000 
documents (16 GB).

When we start adding documents, failures occur very quickly, after maybe 2000 
documents, and the Solr servers can no longer be accessed.
I have attached part of the logs to this mail.

When we use SolrCloud with only one node and a single ZooKeeper ensemble, we 
don't encounter any problems.



Some details on our configuration:
We send about 400 documents per minute.
The documents are added to Solr by two threads in our application, using the 
CloudSolrServer class.
These threads don't call the commit method. We rely only on the Solr config to 
commit. The solrconfig.xml currently defines:
<autoCommit><maxTime>15000</maxTime><openSearcher>false</openSearcher></autoCommit>
No soft commit.
We have also tried:
<autoCommit><maxTime>60</maxTime><openSearcher>false</openSearcher></autoCommit>
<autoSoftCommit><maxTime>1000</maxTime></autoSoftCommit>

The Solr servers are launched with these options:
-Xmx12G -Xms4G
-XX:MaxPermSize=256m -XX:MaxNewSize=356m
-XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:+UseParNewGC
-XX:+CMSClassUnloadingEnabled
-XX:MinHeapFreeRatio=10
-XX:MaxHeapFreeRatio=25
-DzkHost=server1:2188,server2:2188,server3:2188

The solr.xml contains zkClientTimeout=6 and zoo.cfg defines a tickTime of 
3000 ms.

The Solr servers on which we are facing problems contain old collections 
and old cores created for some tests.



Could you give me some pointers?
Is this a problem in our Solr or ZooKeeper config?
How could we detect network problems?
Is there a problem with the VM parameters? Should we analyse some garbage 
collection logs?

Thanks in advance.

Joel Gaspard


Re: Indexing problems

2013-01-31 Thread Erick Erickson
I'm really surprised you're hitting OOM errors, I suspect you have
something else pathological in your system. So, I'd start checking things
like
- how many concurrent warming searchers you allow
- How big your indexing RAM is set to (we find very little gain over 128M
BTW).
- Other load on your Solr server. Are you, for instance, searching on it
too?
- what your autocommit characteristics are (think about autocommitting
fairly often with openSearcher=false).
- have you defined huge caches?
- .

How big are these documents anyway? With 12G of ram, they'd have to be
absolutely _huge_ to matter much.

Multiple collections should work fine in ZK. I really think you have some
innocent-looking configuration setting that's bollixing you up; this is not
expected behavior.

If at all possible, I'd also go with 4.1. I don't really think it's
relevant to your situation, but there have been a lot of improvements in
the code.

Best
Erick


RE: Indexing problems

2013-01-31 Thread GASPARD Joel
Hello Erick,

Thanks for your answer.

After reading previous threads on the user list, we had already tried 
changing the parameters you mentioned.

- concurrent warming searchers: we have set maxWarmingSearchers to 2:
<maxWarmingSearchers>2</maxWarmingSearchers>

- we have tried 32 and 64 for ramBufferSizeMB

- there is no other load on the Solr server, and no searching while we index

- the autocommit is defined with openSearcher=false, maxTime=60ms, 
maxDocs=6000; the autoSoftCommit is defined with maxTime=1000
We have already tried changing the soft commit and commit parameters in 
several ways. We have also tried committing on the client side.
OK, I will try committing more often.

- we have used the cache sizes defined in the example: size=512

The documents are not too big, I think: 1 million documents produce a 6 GB 
index.

Thanks for your answer on multiple collections. I thought multiple collections 
had to share the same schema in ZK after reading a wiki page, 
http://wiki.apache.org/solr/NewSolrCloudDesign, which says: "The entire 
cluster must have a single schema and solrconfig"
Is this page perhaps outdated?
I also thought so because OOM errors occur only when we index a second 
collection. There is no problem when indexing a single collection.

Moving to 4.1 would not be easy for now... We'll think about it.

Thanks.

Joel

