solr facet fields doesn't honor fq

2012-07-08 Thread Chamnap Chhorn
Hi all,

I have a question related to Solr 3.5 field faceting. Here is my query:

http://localhost:8081/solr_new/select?tie=0.1&q.alt=*:*&q=bank&qf=name address&fq=portal_uuid:+A4E7890F-A188-4663-89EB-176D94DF6774&defType=dismax&facet=true&facet.field=location_uuid&facet.field=sub_category_uuids

What I get back with the field facets is:
1. Some location_uuids that are in the current portal_uuid (facet count
> 0)
2. Some location_uuids that are not in the current portal_uuid at all (facet
count = 0)

It seems that Solr doesn't honor the fq at all when returning field facets.
I need to add one more parameter, facet.mincount=1, in order to not return
the location_uuids facet values in (2).

I think Solr does faceting on all location_uuid values. It should do that
scoped to the current portal_uuid. Any idea?

-- 
Chhorn Chamnap
http://chamnap.github.com/


Re: Solr Faceting

2012-07-08 Thread Christian von Wendt-Jensen
You could add this filter directly in the solr query. Here is an example
using SolrJ:


SolrQuery solrQuery = new SolrQuery();
solrQuery.set("q", "*:*");
solrQuery.addFilterQuery("-myfield:N/A");



Christian von Wendt-Jensen





On 07/01/2012 1:32 PM, Darren Govoni dar...@ontrenet.com wrote:

I don't think it comes at any added cost for solr to return that facet
so you can filter it
out in your business logic.

On Sat, 2012-07-07 at 15:18 +0530, Shanu Jha wrote:

 Hi,
 
 
 I am generating a facet for a field which has NA as one of its values, and I
 want Solr to not create a facet for (i.e., to ignore) this NA value. Is
 there any way in Solr to do that?
 
 Thanks





Re: Multi-thread UpdateProcessor

2012-07-08 Thread Mikhail Khludnev
Some benchmarks added. Please check the JIRA.

On Fri, Jul 6, 2012 at 11:13 PM, Dmitry Kan dmitry@gmail.com wrote:

 Mikhail,

 you have my +1 and a jira comment :)

 // Dmitry

 On Fri, Jul 6, 2012 at 7:41 PM, Mikhail Khludnev 
 mkhlud...@griddynamics.com
  wrote:

  Okay, why do you think this idea is not worth looking at?
 
  On Fri, Jul 6, 2012 at 12:53 AM, Mikhail Khludnev 
  mkhlud...@griddynamics.com wrote:
 
   Hello,
  
   Most times when single-thread streaming
   (http://wiki.apache.org/solr/Solrj#Streaming_documents_for_an_update) is
   used, I see low CPU utilization on the Solr server. The reasonable
   motivation is to use more threads to index faster, but that requires a
   more complicated client side.
   I propose to employ a special update processor which can fork the stream
   processing onto many threads. If you like it, please vote for
   https://issues.apache.org/jira/browse/SOLR-3585 .
  
   Regards
  
   --
   Sincerely yours
   Mikhail Khludnev
   Tech Lead
   Grid Dynamics
  
   http://www.griddynamics.com
mkhlud...@griddynamics.com
  
  
 
 
  --
  Sincerely yours
  Mikhail Khludnev
  Tech Lead
  Grid Dynamics
 
  http://www.griddynamics.com
   mkhlud...@griddynamics.com
 



 --
 Regards,

 Dmitry Kan




-- 
Sincerely yours
Mikhail Khludnev
Tech Lead
Grid Dynamics

http://www.griddynamics.com
 mkhlud...@griddynamics.com
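
For context, the "more complicated client side" above amounts to fanning
document batches out over several indexing threads. A minimal SolrJ sketch
(an editorial illustration, assuming Solr 3.6+; the URL, batch helper, and
field values are hypothetical):

import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class ParallelIndexer {
  public static void main(String[] args) throws Exception {
    final HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr");
    ExecutorService pool = Executors.newFixedThreadPool(4); // one update stream per thread
    for (final List<SolrInputDocument> batch : batches()) {
      pool.submit(new Runnable() {
        public void run() {
          try {
            server.add(batch); // HttpSolrServer is thread-safe; each thread sends its own batch
          } catch (Exception e) {
            e.printStackTrace();
          }
        }
      });
    }
    pool.shutdown();
    pool.awaitTermination(1, TimeUnit.HOURS);
    server.commit(); // single commit once all threads are done
  }

  // hypothetical helper; real code would partition the input documents
  private static List<List<SolrInputDocument>> batches() {
    SolrInputDocument doc = new SolrInputDocument();
    doc.addField("id", "example-1");
    return Arrays.asList(Arrays.asList(doc));
  }
}

SOLR-3585 proposes doing this fork inside Solr instead, so a plain
single-stream client could still keep all server CPUs busy.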


Re: MoreLikeThis and mlt.count

2012-07-08 Thread Lee Carroll
Hi Bruno

I'm not sure that makes sense for a query which does not have a boolean
element to it. What is your use case?



On 7 July 2012 18:36, Bruno Mannina bmann...@free.fr wrote:

 Dear Solr users,

 I have a field named fid defined as:
 <field name="fid" type="string" indexed="true" stored="true"
 required="true" termVectors="true"/>

 This fid can have a value like:
 a0001
 b57855
 3254
 etc...
 (length 20 digits)

 I would like to get *all* docs that the result returns. By default
 mlt.count is set to 5, but I don't want to
 set it to 200 in my URL just to be sure to get all results in the same XML.

 Is there a way to set mlt.count so as to always get *all* MLT documents?

 I read http://wiki.apache.org/solr/MoreLikeThis without
 finding a solution.



 Sincerely,
 Bruno
 Solr 3.6
 Ubuntu



Re: Getting only one result by family?

2012-07-08 Thread Lee Carroll
Hi Bruno,

As described, see http://wiki.apache.org/solr/FieldCollapsing, but also
consider faceting, as this often fits the bill.
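
A minimal sketch of the grouping approach in SolrJ (an editorial
illustration; the query and the fid field come from the post below):

// assumes org.apache.solr.client.solrj.SolrQuery (SolrJ 3.3+)
SolrQuery q = new SolrQuery("title:airplane");
q.set("group", true);        // enable field collapsing / result grouping
q.set("group.field", "fid"); // one group per family id
q.set("group.limit", 1);     // keep a single doc per family
q.setRows(20);               // 20 groups = 20 distinct families

Adding group.main=true flattens the groups back into an ordinary result
list, one representative document per family.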

On 7 July 2012 22:27, Bruno Mannina bmann...@free.fr wrote:

 Dear Solr users,

 I have a field named FID for Family-ID:
 <field name="fid" type="string" indexed="true" stored="true"
 required="true" termVectors="true"/>

 My uniqueKey is the field PN, and I have several other fields (text-en,
 string, general text, etc...).

 When I do a request on my index, like:
 title:airplane

 I get several docs, but some docs are members of the same family (their
 FIDs are equal).
 Example:
 Doc1
 fid=A0123
 Doc2
 fid=B777
 Doc3
 fid=C008
 ...
 Doc175 = same family as Doc1
 fid=A0123
 ...

 Is it possible to get only docs with different FIDs?
 I don't want to see Doc175 in my XML result.
 This way, if I set rows=20, I will have 20 docs from 20 different
 families.

 Thanks for your help,
 Bruno
 Solr3.6
 Ubuntu



Re: MoreLikeThis and mlt.count

2012-07-08 Thread Bruno Mannina

Hi,

My docs are patents. Patents have family members, and I would like to get
docs by PN (the Patent Number field, my uniqueKey).


My request will be:
?q=pn:EP100A1&mlt=true

With this method I will get all the equivalents (family members of EP100A1).

If setting mlt.count automatically to MAX is not possible, then I will set
it to 500.
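
That request, sketched in SolrJ (an editorial illustration; mlt.fl is
required by the MoreLikeThis component and the similarity field shown here
is only a guess; there is no "unlimited" value for mlt.count, so a generous
cap stands in for "all"):

// assumes org.apache.solr.client.solrj.SolrQuery
SolrQuery q = new SolrQuery("pn:EP100A1");
q.set("mlt", true);      // enable the MoreLikeThis component
q.set("mlt.fl", "fid");  // hypothetical similarity field
q.set("mlt.count", 500); // no "all" setting exists; use a generous cap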


On 08/07/2012 11:17, Lee Carroll wrote:

Hi Bruno

I'm not sure that makes sense for a query which does not have a boolean
element to it. What is your use case?



On 7 July 2012 18:36, Bruno Mannina bmann...@free.fr wrote:


Dear Solr users,

I have a field named fid defined as:
<field name="fid" type="string" indexed="true" stored="true"
required="true" termVectors="true"/>

This fid can have a value like:
a0001
b57855
3254
etc...
(length 20 digits)

I would like to get *all* docs that the result returns. By default
mlt.count is set to 5, but I don't want to
set it to 200 in my URL just to be sure to get all results in the same XML.

Is there a way to set mlt.count so as to always get *all* MLT documents?

I read http://wiki.apache.org/solr/MoreLikeThis without
finding a solution.



Sincerely,
Bruno
Solr 3.6
Ubuntu






Re: Getting only one result by family?

2012-07-08 Thread Bruno Mannina

Hi Lee,

I tried grouping on my FID field and, ouch, got an error 500 + an
OutOfMemoryError...

I haven't tested facets yet.

Thanks,
Bruno

On 08/07/2012 11:19, Lee Carroll wrote:

Hi Bruno,

As described, see http://wiki.apache.org/solr/FieldCollapsing, but also
consider faceting, as this often fits the bill.

On 7 July 2012 22:27, Bruno Mannina bmann...@free.fr wrote:


Dear Solr users,

I have a field named FID for Family-ID:
<field name="fid" type="string" indexed="true" stored="true"
required="true" termVectors="true"/>

My uniqueKey is the field PN, and I have several other fields (text-en,
string, general text, etc...).

When I do a request on my index, like:
title:airplane

I get several docs, but some docs are members of the same family (their
FIDs are equal).
Example:
Doc1
fid=A0123
Doc2
fid=B777
Doc3
fid=C008
...
Doc175 = same family as Doc1
fid=A0123
...

Is it possible to get only docs with different FIDs?
I don't want to see Doc175 in my XML result.
This way, if I set rows=20, I will have 20 docs from 20 different
families.

Thanks for your help,
Bruno
Solr3.6
Ubuntu






Re: Getting only one result by family?

2012-07-08 Thread Lee Carroll
see  http://wiki.apache.org/solr/SolrPerformanceFactors#OutOfMemoryErrors
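
If grouping stays too memory-hungry, the faceting alternative mentioned
earlier can be sketched like this in SolrJ (an editorial illustration using
the field names from the thread):

// assumes org.apache.solr.client.solrj.SolrQuery
SolrQuery q = new SolrQuery("title:airplane");
q.setRows(0);           // only the family ids are needed, not the docs
q.setFacet(true);
q.addFacetField("fid"); // one facet value per family
q.setFacetMinCount(1);  // skip families with no matching docs
q.setFacetLimit(20);    // 20 distinct families

Note this returns family ids rather than documents; fetching one
representative doc per id is a cheap follow-up query.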

On 8 July 2012 12:37, Bruno Mannina bmann...@free.fr wrote:

 Hi Lee,

 I tried grouping on my FID field and, ouch, got an error 500 + an
 OutOfMemoryError...

 I haven't tested facets yet.

 Thanks,
 Bruno

 On 08/07/2012 11:19, Lee Carroll wrote:

  Hi Bruno,

 As described, see http://wiki.apache.org/solr/FieldCollapsing, but also
 consider faceting, as this often fits the bill.

 On 7 July 2012 22:27, Bruno Mannina bmann...@free.fr wrote:

  Dear Solr users,

 I have a field named FID for Family-ID:
 <field name="fid" type="string" indexed="true" stored="true"
 required="true" termVectors="true"/>

 My uniqueKey is the field PN, and I have several other fields (text-en,
 string, general text, etc...).

 When I do a request on my index, like:
 title:airplane

 I get several docs, but some docs are members of the same family (their
 FIDs are equal).
 Example:
 Doc1
 fid=A0123
 Doc2
 fid=B777
 Doc3
 fid=C008
 ...
 Doc175 = same family as Doc1
 fid=A0123
 ...

 Is it possible to get only docs with different FIDs?
 I don't want to see Doc175 in my XML result.
 This way, if I set rows=20, I will have 20 docs from 20 different
 families.

 Thanks for your help,
 Bruno
 Solr3.6
 Ubuntu






Re: solr facet fields doesn't honor fq

2012-07-08 Thread Erick Erickson
Solr faceting only counts documents that satisfy the query. Think of it
as assembling a list of all possible values for a field and then adding
1 for each value found in each document that satisfies the overall
query (including the filter query). So you can get counts of 0; that's
expected. Adding facet.mincount=1 will keep these from being returned.

I suspect that your query is not finding the documents you think it is or your
filter query is not parsed as you expect. If you add debugQuery=on you'll
see the parsed form of both. In particular, look for your complex fq to be
broken up and distributed with some parts against your portal_uuid and
some against the default search field. Note that '+' and '-' are
operators and the top-level parsers may be splitting these up. Quoting or
parenthesizing may help.
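
A sketch of that check in SolrJ (an editorial illustration using the
parameters from the post; quoting the uuid keeps '+' from being treated as
an operator):

// assumes org.apache.solr.client.solrj.SolrQuery
SolrQuery q = new SolrQuery("bank");
q.set("defType", "dismax");
q.set("q.alt", "*:*");
q.set("tie", "0.1");
q.set("qf", "name address");
q.addFilterQuery("portal_uuid:\"A4E7890F-A188-4663-89EB-176D94DF6774\"");
q.setFacet(true);
q.addFacetField("location_uuid", "sub_category_uuids");
q.setFacetMinCount(1);     // suppress zero-count facet values
q.set("debugQuery", "on"); // shows the parsed q and fq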

Best
Erick

On Sun, Jul 8, 2012 at 2:32 AM, Chamnap Chhorn chamnapchh...@gmail.com wrote:
 Hi all,

 I have a question related to Solr 3.5 field faceting. Here is my query:

 http://localhost:8081/solr_new/select?tie=0.1&q.alt=*:*&q=bank&qf=name address&fq=portal_uuid:+A4E7890F-A188-4663-89EB-176D94DF6774&defType=dismax&facet=true&facet.field=location_uuid&facet.field=sub_category_uuids

 What I get back with the field facets is:
 1. Some location_uuids that are in the current portal_uuid (facet count
 > 0)
 2. Some location_uuids that are not in the current portal_uuid at all (facet
 count = 0)

 It seems that Solr doesn't honor the fq at all when returning field facets.
 I need to add one more parameter, facet.mincount=1, in order to not return
 the location_uuids facet values in (2).

 I think Solr does faceting on all location_uuid values. It should do that
 scoped to the current portal_uuid. Any idea?

 --
 Chhorn Chamnap
 http://chamnap.github.com/


SolrCloud error while propagating update to primary ZK node

2012-07-08 Thread avenka
I get a JSON parse error (pasted below) when I send an update to a replica
node. I downloaded Solr 4 alpha and followed the instructions at
http://wiki.apache.org/solr/SolrCloud/, and set up numShards=1 with 3 total
servers managed by a ZooKeeper ensemble: the primary at 8983 and the other
two at 7574 and 8900 respectively.

The error below shows up in the primary's log when I try to add a document
to either replica. The document add fails. I am able to successfully add
documents by directly sending to the primary. How do I correctly add
documents to replicas?

SEVERE: org.apache.noggit.JSONParser$ParseException: JSON Parse Error:
char=<,position=0 BEFORE='' AFTER='<add><doc boost="1.0"><field name="id">2'
at org.apache.noggit.JSONParser.err(JSONParser.java:221)
at org.apache.noggit.JSONParser.next(JSONParser.java:620)
at org.apache.noggit.JSONParser.nextEvent(JSONParser.java:661)
at
org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.processUpdate(JsonLoader.java:105)
at
org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.load(JsonLoader.java:95)
at org.apache.solr.handler.loader.JsonLoader.load(JsonLoader.java:59)
at
org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:92)
at
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
at
org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:240)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1561)
at
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:442)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:263)
at
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1337)
... [snip]



--
View this message in context: 
http://lucene.472066.n3.nabble.com/SolrCloud-error-while-propagating-update-to-primary-ZK-node-tp3993760.html
Sent from the Solr - User mailing list archive at Nabble.com.


SolrCloud replication question

2012-07-08 Thread avenka
I am trying to wrap my head around replication in SolrCloud. I tried the
setup at http://wiki.apache.org/solr/SolrCloud/. I mainly need replication
for high query throughput. The setup at the URL above appears to maintain
just one copy of the index at the primary node (instead of a replicated
index as in a master/slave configuration). Will I still get roughly an
n-fold increase in query throughput with n replicas? And if so, why would
one do master/slave replication with multiple copies of the index at all?

--
View this message in context: 
http://lucene.472066.n3.nabble.com/SolrCloud-replication-question-tp3993761.html
Sent from the Solr - User mailing list archive at Nabble.com.


DataImport using last_indexed_id or getting max(id) quickly

2012-07-08 Thread avenka
My understanding is that the DIH in Solr only records last_indexed_time in
dataimport.properties, but not, say, a last_indexed_id for a primary key 'id'.
How can I efficiently get the max(id)? (Note that 'id' is an auto-increment
field in the database.) Maintaining max(id) outside of Solr is brittle, and
calling max(id) before each dataimport can take several minutes when the
index has several hundred million records.

How can I either import based on ID or get max(id) quickly? I cannot use
timestamp-based import because I get out-of-memory errors if/when Solr falls
behind, and the suggested fixes online did not work for me.
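
One workaround (an editorial sketch, not from the thread): ask Solr itself
for the highest indexed id by sorting on it descending and fetching a single
row. This assumes 'id' is indexed as a sortable numeric type:

// assumes org.apache.solr.client.solrj.* (SolrJ); guard for an empty index omitted
HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr");
SolrQuery q = new SolrQuery("*:*");
q.setSortField("id", SolrQuery.ORDER.desc); // highest id first
q.setRows(1);                               // only the top document is needed
SolrDocument top = server.query(q).getResults().get(0);
long maxIndexedId = Long.parseLong(top.getFieldValue("id").toString());

The next dataimport can then select only rows with id greater than
maxIndexedId.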

--
View this message in context: 
http://lucene.472066.n3.nabble.com/DataImport-using-last-indexed-id-or-getting-max-id-quickly-tp3993763.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Regression of JIRA 1826?

2012-07-08 Thread Jamie Johnson
Is there any more information that folks need to dig into this? I
have so far been unable to figure out what specifically is
happening, so I would appreciate any help.

On Fri, Jul 6, 2012 at 2:13 PM, Jamie Johnson jej2...@gmail.com wrote:
 A little more information on this.

 I tinkered a bit with the schema and it appears to be related to
 WordDelimiterFilterFactory with splitOnCaseChange set to true; at
 least, the issue shows up when this setting is enabled.

 Also I am using the edismax query parser.  Again any ideas/help would
 be greatly appreciated.

 On Fri, Jul 6, 2012 at 1:40 AM, Jamie Johnson jej2...@gmail.com wrote:
 I just upgraded to trunk to try to fix an issue I was having with the
 highlighter described in JIRA 1826, but it appears that this issue
 still exists on trunk.  I'm running the following query

 subject:ztest*

 subject is a text field (not multivalued) and the return in highlighting is

 <em>ZTest</em>For<em>ZTestForJamie</em>

 the actual stored value is ZTestForJamie.  Is anyone else experiencing 
 this?


Top 5 high freq words - UpdateProcessorChain or DIH Script?

2012-07-08 Thread Pranav Prakash
Hi,

I want to store the top 5 highest-frequency non-stopword words. I use DIH to
import data. Now I have two approaches:

   1. Use DIH JavaScript to find the top 5 highest-frequency words and put
   them in a copy field. The copy field will then stem them and remove stop
   words via the appropriate tokenizers and filters.
   2. Write a custom function for the same and add it to the
   UpdateRequestProcessor chain.

Which of the two would be better suited? I find the first approach rather
simple, but the issue is that I won't have access to stop
words/synonyms etc. at DIH time.

In the second approach, if I add the function to the UpdateRequestProcessor
chain and insert it after StopWordsFilterFactory and
DuplicateRemoveFilterFactory, would that be a reasonable way of doing this?
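
A skeleton of approach 2 (an editorial sketch, assuming Solr 4.x
update-processor APIs; the field names "body" and "top_terms" are
hypothetical, the tokenizing is naive, and stop-word removal is left out):

import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import org.apache.solr.common.SolrInputDocument;
import org.apache.solr.update.AddUpdateCommand;
import org.apache.solr.update.processor.UpdateRequestProcessor;

public class TopTermsProcessor extends UpdateRequestProcessor {

  public TopTermsProcessor(UpdateRequestProcessor next) {
    super(next);
  }

  @Override
  public void processAdd(AddUpdateCommand cmd) throws IOException {
    SolrInputDocument doc = cmd.getSolrInputDocument();
    Object body = doc.getFieldValue("body"); // hypothetical source field
    if (body != null) {
      // naive tokenizing on non-word characters; stop words are not removed here
      Map<String, Integer> counts = new HashMap<String, Integer>();
      for (String token : body.toString().toLowerCase().split("\\W+")) {
        if (token.length() == 0) continue;
        Integer n = counts.get(token);
        counts.put(token, n == null ? 1 : n + 1);
      }
      List<Map.Entry<String, Integer>> byFreq =
          new ArrayList<Map.Entry<String, Integer>>(counts.entrySet());
      Collections.sort(byFreq, new Comparator<Map.Entry<String, Integer>>() {
        public int compare(Map.Entry<String, Integer> a, Map.Entry<String, Integer> b) {
          return b.getValue() - a.getValue(); // most frequent first
        }
      });
      for (int i = 0; i < byFreq.size() && i < 5; i++) {
        doc.addField("top_terms", byFreq.get(i).getKey()); // hypothetical multivalued field
      }
    }
    super.processAdd(cmd); // continue down the chain
  }
}

Wiring it in also needs a small UpdateRequestProcessorFactory registered in
solrconfig.xml; that part is omitted here.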

--
*Pranav Prakash*

temet nosce


Re: SolrCloud error while propagating update to primary ZK node

2012-07-08 Thread Jack Krupansky
In theory, with SolrCloud you can add to any replica and the change gets
propagated automatically to all of the other replicas for that shard. In
theory.


The stack trace message suggests that Solr is trying to parse your input as 
JSON when in fact your input is XML. I vaguely recall that Yonik was working 
on update and had implemented something with JSON, but I don't recall that 
XML was also implemented. (or maybe the work was done in trunk but not 
backported to 4x - I just don't recall exactly.) For now, it sounds as if
you have to send updates to the primary node of the shard and then
let Solr replicate them. I'll defer to the Cloud experts on the details.
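
One thing worth trying (an editorial sketch, assuming SolrJ 4.0-ALPHA; the
URL and field values are illustrative): force the default XML RequestWriter
instead of BinaryRequestWriter when talking to a replica, so the content
type of the update matches what the handler expects.

// assumes org.apache.solr.client.solrj.impl.HttpSolrServer,
// org.apache.solr.client.solrj.request.RequestWriter,
// org.apache.solr.common.SolrInputDocument
HttpSolrServer replica = new HttpSolrServer("http://localhost:7574/solr");
replica.setRequestWriter(new RequestWriter()); // XML writer rather than javabin
SolrInputDocument doc = new SolrInputDocument();
doc.addField("id", "2");
replica.add(doc);
replica.commit();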


-- Jack Krupansky

-Original Message- 
From: avenka

Sent: Sunday, July 08, 2012 11:52 AM
To: solr-user@lucene.apache.org
Subject: SolrCloud error while propagating update to primary ZK node

I get a JSON parse error (pasted below) when I send an update to a replica
node. I downloaded Solr 4 alpha and followed the instructions at
http://wiki.apache.org/solr/SolrCloud/, and set up numShards=1 with 3 total
servers managed by a ZooKeeper ensemble: the primary at 8983 and the other
two at 7574 and 8900 respectively.

The error below shows up in the primary's log when I try to add a document
to either replica. The document add fails. I am able to successfully add
documents by directly sending to the primary. How do I correctly add
documents to replicas?

SEVERE: org.apache.noggit.JSONParser$ParseException: JSON Parse Error:
char=<,position=0 BEFORE='' AFTER='<add><doc boost="1.0"><field name="id">2'
... [snip]



Re: SolrCloud error while propagating update to primary ZK node

2012-07-08 Thread Mark Miller
Can you show us exactly how you are adding the document?

Eg, what update handler are you using, and what is the document you are adding?

On Jul 8, 2012, at 12:52 PM, avenka wrote:

 I get a JSON parse error (pasted below) when I send an update to a replica
 node. I downloaded Solr 4 alpha and followed the instructions at
 http://wiki.apache.org/solr/SolrCloud/, and set up numShards=1 with 3 total
 servers managed by a ZooKeeper ensemble: the primary at 8983 and the other
 two at 7574 and 8900 respectively.
 
 The error below shows up in the primary's log when I try to add a document
 to either replica. The document add fails. I am able to successfully add
 documents by directly sending to the primary. How do I correctly add
 documents to replicas?
 
 SEVERE: org.apache.noggit.JSONParser$ParseException: JSON Parse Error:
 char=<,position=0 BEFORE='' AFTER='<add><doc boost="1.0"><field name="id">2'
 ... [snip]

- Mark Miller
lucidimagination.com

Re: SolrCloud error while propagating update to primary ZK node

2012-07-08 Thread avenka
I tried adding in two ways, with the same outcome: (1) using SolrJ to call
HttpSolrServer.add(docList) with BinaryRequestWriter; (2) using
DataImportHandler to import directly from a database through a
db-data-config.xml file.

The document I'm adding has a long primary key id field and a few other string
and timestamp fields. I also added a long _version_ field because the URL said so.
I've been using this schema without problems with 3.6 for a while, and it works
fine when added to the primary in 4.0.

Mark Miller [via Lucene] wrote:

Can you show us exactly how you are adding the document? 

Eg, what update handler are you using, and what is the document you are adding? 

On Jul 8, 2012, at 12:52 PM, avenka wrote: 


 I get a JSON parse error (pasted below) when I send an update to a replica 
 node. I downloaded Solr 4 alpha and followed the instructions at
 http://wiki.apache.org/solr/SolrCloud/, and set up numShards=1 with 3 total
 servers managed by a ZooKeeper ensemble: the primary at 8983 and the other
 two at 7574 and 8900 respectively.
 
 The error below shows up in the primary's log when I try to add a document 
 to either replica. The document add fails. I am able to successfully add 
 documents by directly sending to the primary. How do I correctly add 
 documents to replicas? 
 
 SEVERE: org.apache.noggit.JSONParser$ParseException: JSON Parse Error:
 char=<,position=0 BEFORE='' AFTER='<add><doc boost="1.0"><field name="id">2'
 ... [snip]


- Mark Miller 
lucidimagination.com 

--
View this message in context: 
http://lucene.472066.n3.nabble.com/SolrCloud-error-while-propagating-update-to-primary-ZK-node-tp3993760p3993781.html
Sent from the Solr - User mailing list archive at Nabble.com.

Deployment with LUCENE-2899 patch can't load solr.OpenNLPTokenizerFactory

2012-07-08 Thread Sean Glover
Hi,

Platform: ubuntu 12.04
Package: apache-solr-4.0-2012-07-07_11-55-05-src.tgz
Web: Apache Tomcat/7.0.26

I'm trying to use the LUCENE-2899 patch
(https://issues.apache.org/jira/browse/LUCENE-2899).  As an end-user I
believe this is the correct list to post to.

I'm new to Solr, so I started by successfully deploying the
solr/example project to tomcat7.

To deploy a 2nd instance of solr that uses the OpenNLP configuration I
performed the following steps as per the OpenNLP wiki page
(http://wiki.apache.org/solr/OpenNLP).

ant compile
cd solr/contrib/opennlp/src/test-files/training
run 'bin/trainall.sh'
run 'ant test-contrib'

All these tasks executed successfully.  Then I attempted to actually
deploy my Solr w/ OpenNLP instance with the following steps.

Downloaded real models from http://opennlp.sourceforge.net/models-1.5/
(except for the content of coref, do I need to get this?)
copied my solr-example deployment to create a solr-nlp deployment
copied opennlp config to my deployment config
 source: solr/contrib/opennlp/src/test-files/opennlp/solr/collection1/conf
 dest: /var/tomcat/solr/nlp/solr/collection1/conf
copied opennlp libs to my deployment libs
- source: solr/contrib/opennlp/lib
- dest: /var/tomcat/solr/nlp/solr/collection1/lib
updated my deployed solrconfig.xml
- set dataDir: 
<dataDir>${solr.data.dir:/var/tomcat/solr/nlp/solr/collection1/data}</dataDir>
 - an absolute path was recommended by the Tomcat7 Solr deployment guide
added a lib: <lib dir="/var/tomcat/solr/nlp/solr/collection1/lib"
regex=".*\.jar"/>
- again, i specified an absolute path so tomcat would know exactly
where to load the opennlp libs from

When I attempt to hit my NLP instance of solr I get the following
error (http://localhost:8080/solr-nlp/admin)

This interface requires that you activate the admin request handlers,
add the following configuration to your solrconfig.xml:

However, I have the admin requestHandler defined exactly as requested.
 Is this a catch-all error of some sort?

When I dig a little deeper and look at the Catalina logs I found an
error and stack trace.

SEVERE: null:org.apache.solr.common.SolrException: Plugin init failure
for [schema.xml] fieldType text_opennlp: Plugin init failure for
[schema.xml] analyzer/tokenizer: Error loading class
'solr.OpenNLPTokenizerFactory'
at 
org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:168)
at org.apache.solr.schema.IndexSchema.readSchema(IndexSchema.java:364)
at org.apache.solr.schema.IndexSchema.<init>(IndexSchema.java:111)
at org.apache.solr.core.CoreContainer.create(CoreContainer.java:816)
at org.apache.solr.core.CoreContainer.load(CoreContainer.java:514)
at org.apache.solr.core.CoreContainer.load(CoreContainer.java:335)
at 
org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:284)
at 
org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:106)
at 
org.apache.catalina.core.ApplicationFilterConfig.initFilter(ApplicationFilterConfig.java:277)
at 
org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:258)
at 
org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:382)
at 
org.apache.catalina.core.ApplicationFilterConfig.<init>(ApplicationFilterConfig.java:103)
at 
org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:4638)
at 
org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:5294)
at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:150)
at 
org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:895)
at 
org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:871)
at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:615)
at 
org.apache.catalina.startup.HostConfig.deployDescriptor(HostConfig.java:649)
at 
org.apache.catalina.startup.HostConfig$DeployDescriptor.run(HostConfig.java:1581)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:722)
Caused by: org.apache.solr.common.SolrException: Plugin init failure
for [schema.xml] analyzer/tokenizer: Error loading class
'solr.OpenNLPTokenizerFactory'
at 
org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:168)
at 
org.apache.solr.schema.FieldTypePluginLoader.readAnalyzer(FieldTypePluginLoader.java:322)
at 
org.apache.solr.schema.FieldTypePluginLoader.create(FieldTypePluginLoader.java:95)

PostCommit Document

2012-07-08 Thread Dewi Wahyuni
Hi All,

I would like to know how to use postCommit in Solr properly. I would like to
grab the indexed documents and do further processing with them. How do I
capture the documents being committed to Solr through the arguments in the
postCommit config? I'm not using SolrJ and have no intention of using Java at
the moment.
If this is not possible, please let me know.
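
For context (an editorial note, not a reply from the thread): the postCommit
hook is a SolrEventListener, and its callback receives no document payload,
so the committed documents cannot be captured there; that generally requires
an UpdateRequestProcessor instead. A sketch of the interface:

import org.apache.solr.common.util.NamedList;
import org.apache.solr.core.SolrEventListener;
import org.apache.solr.search.SolrIndexSearcher;

public class LoggingCommitListener implements SolrEventListener {
  public void init(NamedList args) { }

  public void postCommit() {
    // no arguments: there is no access to the committed documents here
    System.out.println("commit finished");
  }

  public void newSearcher(SolrIndexSearcher newSearcher,
                          SolrIndexSearcher currentSearcher) { }

  public void postSoftCommit() { } // present in Solr 4.x
}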

Thank you

Dewi




Re: Regression of JIRA 1826?

2012-07-08 Thread Lance Norskog
Please post a trimmed-down version of your schema.xml and a sample document.

On Sun, Jul 8, 2012 at 11:54 AM, Jamie Johnson jej2...@gmail.com wrote:
 Is there any more information that folks need to dig into this? I
 have so far been unable to figure out what specifically is
 happening, so I would appreciate any help.

 On Fri, Jul 6, 2012 at 2:13 PM, Jamie Johnson jej2...@gmail.com wrote:
 A little more information on this.

 I tinkered a bit with the schema and it appears to be related to
 WordDelimiterFilterFactory with splitOnCaseChange set to true; at
 least, the issue shows up when this setting is enabled.

 Also I am using the edismax query parser.  Again any ideas/help would
 be greatly appreciated.

 On Fri, Jul 6, 2012 at 1:40 AM, Jamie Johnson jej2...@gmail.com wrote:
 I just upgraded to trunk to try to fix an issue I was having with the
 highlighter described in JIRA 1826, but it appears that this issue
 still exists on trunk.  I'm running the following query

 subject:ztest*

 subject is a text field (not multivalued) and the return in highlighting is

 <em>ZTest</em>For<em>ZTestForJamie</em>

 the actual stored value is ZTestForJamie.  Is anyone else experiencing 
 this?



-- 
Lance Norskog
goks...@gmail.com


Re: Indexing Wikipedia

2012-07-08 Thread vineet yadav
Hi,
I would recommend indexing wikipedia xml dump. Check out dataimport
hander example of indexing
wikipedia(http://wiki.apache.org/solr/DataImportHandler#Example%3a_Indexing_wikipedia).
Thanks
Vineet Yadav

On Sun, Jul 8, 2012 at 9:15 AM, kiran kumar kirankumarsm...@gmail.com wrote:
 Hi,
 In our office we have a wikipedia setup for the intranet. I want to index
 the wiki. I have recently been studying how all the wiki pages are stored
 in a database whose schema is fairly standard, following MediaWiki. I
 am also thinking of whether to use the XML dumper to dump all the wiki pages
 into XML and index from there.
 Has anybody done something like this? If so, which way is more efficient
 and easy to implement?
 For me the DB schema looks quite complicated. Can somebody please help
 me understand which is the better implementation for this?

 Thanks,
 Kiran Bushireddy.


RE: Better (and valid) Spellcheck in combination with other parameters with at least one occurance

2012-07-08 Thread ninaddesai82
Thanks James for your reply.

I am using the spellcheck collation options (except
spellcheck.maxCollationTries).
However, will spellcheck.maxCollationTries consider the other parameters in
the query, or just the spellcheck words in q?

Because in my case, if the original query is
solr/search/?q=hangry&c=CA (plus all suggestion params)

then what I want the suggester to return is:
if q=hungry has hits with param c=CA, then return the suggestion hungry
if q=angry has hits with param c=CA, then return the suggestion angry

So, does maxCollationTries consider other parameters while collating the
results?
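
For reference, the parameters under discussion look roughly like this in
SolrJ (an editorial sketch; the field name c and the values come from the
post):

// assumes org.apache.solr.client.solrj.SolrQuery
SolrQuery q = new SolrQuery("hangry");
q.addFilterQuery("c:CA");                 // the extra parameter in question
q.set("spellcheck", true);
q.set("spellcheck.collate", true);
q.set("spellcheck.maxCollationTries", 5); // re-runs candidate collations as real queries

When maxCollationTries is greater than 0, each candidate collation is tested
by re-running it with the original request's parameters, including fq, so
only collations that actually produce hits under c=CA come back.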

- Ninad

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Better-and-valid-Spellcheck-in-combination-with-other-parameters-with-at-least-one-occurance-tp3993484p3993816.html
Sent from the Solr - User mailing list archive at Nabble.com.