Error while adding copy fields to a schema

2014-09-03 Thread Hokam Singh Chauhan
Hi,

I have a requirement in which I have to add some fields to the schema at run
time, and after that I need to add copy fields for some of the schema
fields.

To add the fields to the schema I used the following REST API, which returns
a success response as shown below:

Post URL: http://localhost:8080/solr/bookindex/schema/fields
Content-type: application/json
Post Data:
[
  {
    "indexed": true,
    "name": "age",
    "stored": true,
    "type": "long"
  },
  {
    "indexed": true,
    "name": "sex",
    "stored": true,
    "type": "string"
  },
  {
    "indexed": true,
    "name": "_all",
    "stored": true,
    "type": "string",
    "multiValued": true
  }
]

Output Response:
{
  "responseHeader": {
    "status": 0,
    "QTime": 202
  }
}
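(For reference, a curl sketch of the same two calls; the host, port, and
collection name are simply taken from the URLs in this message and may
differ in your setup:

  # first call: add the fields (body as in the array above)
  curl -X POST -H 'Content-type: application/json' \
    --data-binary '[{"name":"age","type":"long","indexed":true,"stored":true}]' \
    http://localhost:8080/solr/bookindex/schema/fields

  # second call: add the copy fields (body as in the array further below)
  curl -X POST -H 'Content-type: application/json' \
    --data-binary '[{"source":"age","dest":"_all"}]' \
    http://localhost:8080/solr/bookindex/schema/copyfields
)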

After adding these fields to the schema, when I execute the second call to
add the copy fields, I get the error "Error persisting managed schema at
/configs/myconf/managed-schema" in the response.

Following are the details of the REST API I am using to add the copy
fields, along with the error response.


Post URL: http://localhost:7070/solr/bookindex/schema/copyfields
Content-type: application/json
Post Data:
[
  {
    "source": "age",
    "dest": "_all"
  },
  {
    "source": "sex",
    "dest": "_all"
  }
]
Output Response:
{
  "responseHeader": {
    "status": 500,
    "QTime": 190},
  "error": {
    "msg": "Error persisting managed schema at
/configs/myconf/managed-schema",
    "trace": "org.apache.solr.common.SolrException: Error persisting managed
schema at /configs/myconf/managed-schema\n\tat
org.apache.solr.schema.ManagedIndexSchema.persistManagedSchemaToZooKeeper(ManagedIndexSchema.java:166)\n\tat
org.apache.solr.schema.ManagedIndexSchema.persistManagedSchema(ManagedIndexSchema.java:83)\n\tat
org.apache.solr.schema.ManagedIndexSchema.addCopyFields(ManagedIndexSchema.java:281)\n\tat
org.apache.solr.rest.schema.CopyFieldCollectionResource.post(CopyFieldCollectionResource.java:174)\n\tat
org.restlet.resource.ServerResource.doHandle(ServerResource.java:437)\n\tat
org.restlet.resource.ServerResource.doConditionalHandle(ServerResource.java:350)\n\tat
org.restlet.resource.ServerResource.handle(ServerResource.java:952)\n\tat
org.restlet.resource.Finder.handle(Finder.java:246)\n\tat
org.restlet.routing.Filter.doHandle(Filter.java:159)\n\tat
org.restlet.routing.Filter.handle(Filter.java:206)\n\tat
org.restlet.routing.Router.doHandle(Router.java:431)\n\tat
org.restlet.routing.Router.handle(Router.java:648)\n\tat
org.restlet.routing.Filter.doHandle(Filter.java:159)\n\tat
org.restlet.routing.Filter.handle(Filter.java:206)\n\tat
org.restlet.routing.Filter.doHandle(Filter.java:159)\n\tat
org.restlet.routing.Filter.handle(Filter.java:206)\n\tat
org.restlet.routing.Filter.doHandle(Filter.java:159)\n\tat
org.restlet.engine.application.StatusFilter.doHandle(StatusFilter.java:155)\n\tat
org.restlet.routing.Filter.handle(Filter.java:206)\n\tat
org.restlet.routing.Filter.doHandle(Filter.java:159)\n\tat
org.restlet.routing.Filter.handle(Filter.java:206)\n\tat
org.restlet.engine.CompositeHelper.handle(CompositeHelper.java:211)\n\tat
org.restlet.engine.application.ApplicationHelper.handle(ApplicationHelper.java:84)\n\tat
org.restlet.Application.handle(Application.java:381)\n\tat
org.restlet.routing.Filter.doHandle(Filter.java:159)\n\tat
org.restlet.routing.Filter.handle(Filter.java:206)\n\tat
org.restlet.routing.Router.doHandle(Router.java:431)\n\tat
org.restlet.routing.Router.handle(Router.java:648)\n\tat
org.restlet.routing.Filter.doHandle(Filter.java:159)\n\tat
org.restlet.routing.Filter.handle(Filter.java:206)\n\tat
org.restlet.routing.Router.doHandle(Router.java:431)\n\tat
org.restlet.routing.Router.handle(Router.java:648)\n\tat
org.restlet.routing.Filter.doHandle(Filter.java:159)\n\tat
org.restlet.routing.Filter.handle(Filter.java:206)\n\tat
org.restlet.engine.CompositeHelper.handle(CompositeHelper.java:211)\n\tat
org.restlet.Component.handle(Component.java:392)\n\tat
org.restlet.Server.handle(Server.java:516)\n\tat
org.restlet.engine.ServerHelper.handle(ServerHelper.java:72)\n\tat
org.restlet.engine.adapter.HttpServerHelper.handle(HttpServerHelper.java:152)\n\tat
org.restlet.ext.servlet.ServerServlet.service(ServerServlet.java:1089)\n\tat
javax.servlet.http.HttpServlet.service(HttpServlet.java:848)\n\tat
org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:669)\n\tat
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:457)\n\tat
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)\n\tat
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:575)\n\tat
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)\n\tat
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)\n\tat
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)\n\tat
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)\n\tat

Re: Schema API synchronization question

2014-09-03 Thread Matthias Broecheler
Yes, that is what we are seeing. Thanks for pointing me to the right issues
to track.
Where can I find out when 4.10 final is going to be released?

Thanks,
Matthias


On Sat, Aug 30, 2014 at 9:26 PM, Erick Erickson erickerick...@gmail.com
wrote:

 There have been some recent improvements in that area, what version of Solr
 are you running? Is there any chance you could try with 4.10 when the final
 version is released? Or perhaps checkout/build the 4.10 release candidate?

 See, for instance, https://issues.apache.org/jira/browse/SOLR-6137

 Still open: https://issues.apache.org/jira/browse/SOLR-6249

 Do either of these describe what you are seeing?

 If not, how exactly are things going wonky?

 Best,
 Erick


 On Sat, Aug 30, 2014 at 7:02 PM, Matthias Broecheler m...@matthiasb.com
 wrote:

  Hello everybody,
 
  from reading the documentation it is not entirely clear what the
  synchronization behavior of Solr's schema API is. We are seeing some
  reliability issues in a multi-machine SolrCloud setup. Granted, being new
  we might be doing something wrong, but at this point I am confused as to
  what the expected behavior ought to be.
 
  It would be wonderful if somebody could point me to or explain how schema
  changes made through the API are propagated in a cluster, what happens if
  documents are added concurrently and any known issues that might exist in
  that regard.
 
  Thank you very much,
  Matthias
 
  --
  Matthias Broecheler
  http://www.matthiasb.com
 




-- 
Matthias Broecheler
http://www.matthiasb.com


Re: Schema API synchronization question

2014-09-03 Thread Steve Rowe
The release vote has passed, the release packages are spreading out to the 
mirrors, and the announcement should appear in the next 12-24 hours.

Steve
www.lucidworks.com

On Sep 2, 2014, at 11:56 PM, Matthias Broecheler m...@matthiasb.com wrote:

 Yes, that is what we are seeing. Thanks for pointing me to the right issues
 to track.
 Where can I find out when 4.10 final is going to be released?
 
 Thanks,
 Matthias
 
 
 On Sat, Aug 30, 2014 at 9:26 PM, Erick Erickson erickerick...@gmail.com
 wrote:
 
 There have been some recent improvements in that area, what version of Solr
 are you running? Is there any chance you could try with 4.10 when the final
 version is released? Or perhaps checkout/build the 4.10 release candidate?
 
 See, for instance, https://issues.apache.org/jira/browse/SOLR-6137
 
 Still open: https://issues.apache.org/jira/browse/SOLR-6249
 
 Do either of these describe what you are seeing?
 
 If not, how exactly are things going wonky?
 
 Best,
 Erick
 
 
 On Sat, Aug 30, 2014 at 7:02 PM, Matthias Broecheler m...@matthiasb.com
 wrote:
 
 Hello everybody,
 
 from reading the documentation it is not entirely clear what the
 synchronization behavior of Solr's schema API is. We are seeing some
 reliability issues in a multi-machine SolrCloud setup. Granted, being new
 we might be doing something wrong, but at this point I am confused as to
 what the expected behavior ought to be.
 
 It would be wonderful if somebody could point me to or explain how schema
 changes made through the API are propagated in a cluster, what happens if
 documents are added concurrently and any known issues that might exist in
 that regard.
 
 Thank you very much,
 Matthias
 
 --
 Matthias Broecheler
 http://www.matthiasb.com
 
 
 
 
 
 -- 
 Matthias Broecheler
 http://www.matthiasb.com



looking for a solr/search expert in Paris

2014-09-03 Thread elisabeth benoit
Hello,


We are looking for a Solr consultant to help us with our development work
using Solr. We've been working on this for a little while, and we feel we
need an expert point of view on what we're doing, someone who could give us
insights about our Solr conf, performance issues, error handling issues (a
big thing). Well, everything.

The enterprise is in the Paris (France) area. Any suggestion is welcome.

Thanks,
Elisabeth


AUTO: Saravanan Chinnadurai is out of the office (returning 04/09/2014)

2014-09-03 Thread Saravanan . Chinnadurai
I will be out of the office starting 03/09/2014 and will not return until
04/09/2014

 Please email itsta...@actionimages.com for any urgent queries.


Note: This is an automated response to your message "How can I set shard
members?" sent on 9/3/2014 5:00:04.

This is the only notification you will receive while this person is away.


Action Images is a division of Reuters Limited and your data will therefore
be protected in accordance with the Reuters Group Privacy / Data Protection
notice which is available in the privacy footer at www.reuters.com
Registered in England No. 145516   VAT REG: 397000555


How to stop Solr delta import from creating a log file

2014-09-03 Thread madhav bahuguna
I have Solr installed on Debian, and every time a delta import takes place a
file gets created in my root directory. The files that get created look
like this:


  dataimport?command=delta-import.1

  dataimport?command=delta-import.2

 .

 .

 .

  dataimport?command=delta-import.30

Every time there is a delta import a file gets created. I opened one of the
files in the vi editor and it is an XML file. Why are these files getting
created, and how do I stop Solr from creating them?


To start Solr I use this command:


  java -jar start.jar

According to this command no log files should be created. Please advise and
help; I am new to Solr.
-- 
Regards
Madhav Bahuguna


Create collection dynamically in my program

2014-09-03 Thread xinwu
Hi , all:
I create a collection per day dynamically in my program, like this:
 http://lucene.472066.n3.nabble.com/file/n4156601/create1.png
But when I searched data with collection=myCollection-20140903, it
showed "Collection not found: myCollection-20140903".
I checked the clusterState in debug mode, and there was no
myCollection-20140903 in it.
But there actually was a myCollection-20140903 entry in the zk
clusterstate.json.

Is there something wrong in my approach?
Is there a new or better way to create collections dynamically?

Thanks!
-Xinwu
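(For comparison, a minimal curl sketch of creating such a collection through
the Collections API; the host, numShards, replicationFactor, and
collection.configName values below are placeholders, not taken from the
original message:

  curl "http://localhost:8983/solr/admin/collections?action=CREATE&name=myCollection-20140903&numShards=1&replicationFactor=2&collection.configName=myconf"
)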







Re: Create collection dynamically in my program

2014-09-03 Thread Jürgen Wagner (DVT)
Hello Xinwu,
  does it change anything if you use an underscore instead of the dash in
the collection name?

What is the result of the call? Any status or error message?

Did you actually feed data into the collection?

Cheers,
--Jürgen

On 03.09.2014 11:21, xinwu wrote:
 Hi , all:
 I created collection per day dynamically in my program.Like this:
  http://lucene.472066.n3.nabble.com/file/n4156601/create1.png 
 But,when I searched data with collection=myCollection-20140903,it
 showed Collection not found:myCollection-20140903 .
 I checked the clusterState in debug mode , there was not
 myCollection-20140903 in it.
 But,there was myCollection-20140903 in zk clusterstate.json
 actually.

 Is there something wrong in my way?
 If there is new way or better way to create collection dynamically?

 Thanks!
 -Xinwu







-- 

Mit freundlichen Grüßen/Kind regards/Cordialement vôtre/Atentamente/С
уважением
i.A. Jürgen Wagner
Head of Competence Center Intelligence
 Senior Cloud Consultant

Devoteam GmbH, Industriestr. 3, 70565 Stuttgart, Germany
Phone: +49 6151 868-8725, Fax: +49 711 13353-53, Mobile: +49 171 864 1543
E-Mail: juergen.wag...@devoteam.com, URL: www.devoteam.de


Managing Board: Jürgen Hatzipantelis (CEO)
Address of Record: 64331 Weiterstadt, Germany; Commercial Register:
Amtsgericht Darmstadt HRB 6450; Tax Number: DE 172 993 071




Re: HTTPS for SolrCloud

2014-09-03 Thread Christopher Gross
Once I upgraded to 4.9.0, the solr.ssl.checkPeerName option was used, and I
was able to create a collection.

I'm still wondering if there is a good way to remove references to any
collections that didn't complete creation but still block a new collection
from being made with the same name.

Thanks!

-- Chris
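(A hedged aside: the same zkcli.sh that ships with Solr also has get and
clear commands, which can help when inspecting or hand-removing stray
znodes; the zkhost and path below are illustrative:

  # look at what is actually stored in a znode
  ./zkcli.sh -zkhost localhost:2181 -cmd get /clusterprops.json

  # delete a znode and its data
  ./zkcli.sh -zkhost localhost:2181 -cmd clear /clusterprops.json
)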


On Tue, Sep 2, 2014 at 2:30 PM, Christopher Gross cogr...@gmail.com wrote:

 Is the solr.ssl.checkPeerName option available in 4.8.1?  I have my
 Tomcat starting up with that as a -D option, but I'm getting an exception
 on validating the hostname w/ the cert...

 -- Chris


 On Tue, Sep 2, 2014 at 1:44 PM, Christopher Gross cogr...@gmail.com
 wrote:

 OK -- so I think my previous attempts were causing the problem.
 Since this is a dev environment (and is still empty), I just went ahead
 and wiped out the version-2 directories for the zookeeper nodes, reloaded
 my solr collections, then ran that command (zkcli.sh in the solr distro).
 That did work.  What is a reliable way to remove a file from Zookeeper?

 Now I just get this error when trying to create a collection:
 org.apache.solr.client.solrj.SolrServerException:IOException occured when
 talking to server at: https://server:8444

 This brings up another problem that I have -- if there's an error
 creating a collection, and I then fix the issue and try to re-create the
 collection, I get something like this:

 <str name="Operation createcollection caused
 exception:">org.apache.solr.common.SolrException:org.apache.solr.common.SolrException:
 collection already exists: testcollection</str>

 How do I go about cleaning those up?  The only reliable thing that I've
 found is to wipe out the zookeepers and start over.

 Thanks Hoss!




 -- Chris


 On Tue, Sep 2, 2014 at 1:08 PM, Chris Hostetter hossman_luc...@fucit.org
  wrote:


: ./zkcli.sh -zkhost localhost:2181 -cmd put /clusterprops.json
: '{"urlScheme":"https"}'
 ...
: Next I start Tomcat, I get this:
: 482  [localhost-startStop-1] ERROR org.apache.solr.core.SolrCore  –
: null:org.noggit.JSONParser$ParseException: JSON Parse Error:
: char=',position=0 BEFORE=''' AFTER='{"urlScheme":"https"}''

I can't reproduce the error you are describing when I follow all the
steps on the SSL doc page (using bash, and the outer single quotes, just
like you)...


 https://cwiki.apache.org/confluence/display/solr/Enabling+SSL#EnablingSSL-SolrCloud


Are you certain that you and your Solr nodes are talking to the same
ZooKeeper instance?

(Because according to that error, there is a stray single-quote at the
beginning of the clusterprops.json file in the ZK server solr is
talking to, and as you already confirmed there's no single quotes in the
string you read back from the zk server you are talking to ... perhaps
there are 2 zk instances setup somewhere and the one solr is using still
has crufty data from before you got the quoting issue straightened out?)


 do you see log messages early on in Solr's startup from ZkContainer that
 say...

 1359 [main] INFO  org.apache.solr.core.ZkContainer  – Zookeeper
 client=localhost:2181

 ?
 -Hoss
 http://www.lucidworks.com/






Re: looking for a solr/search expert in Paris

2014-09-03 Thread Jack Krupansky
Don't forget to check out the Solr Support wiki where consultants advertise 
their services:

http://wiki.apache.org/solr/Support

And any Solr or Lucene consultants on this mailing list should be sure that 
they are registered on that support wiki. Hey, it's free! And be sure to 
keep your listing up to date, including regional availability and any 
specialties.


-- Jack Krupansky

-Original Message- 
From: elisabeth benoit

Sent: Wednesday, September 3, 2014 4:02 AM
To: solr-user@lucene.apache.org
Subject: looking for a solr/search expert in Paris

Hello,


We are looking for a solr consultant to help us with our devs using solr.
We've been working on this for a little while, and we feel we need an
expert point of view on what we're doing, who could give us insights about
our solr conf, performance issues, error handling issues (big thing). Well
everything.

The entreprise is in the Paris (France) area. Any suggestion is welcomed.

Thanks,
Elisabeth 



Re: How to stop Solr delta import from creating a log file

2014-09-03 Thread Shawn Heisey
On 9/3/2014 3:19 AM, madhav bahuguna wrote:
 I have solr installed on Debian and every time delta import takes place a
 file gets created in my root directory. The files that get created  look
 like this

I figure there's one of two possibilities:

1) You've got a misconfiguration in the dataimport handler.

2) Solr has a bug that doesn't show up for most people, because most
people don't run Solr with full root/administrator privileges.  On Linux
systems only root typically has write privileges on the root directory.

You'll need to share your configs to see if there's anything obviously
wrong in them.  We'll also need to know which version you're on.

Thanks,
Shawn



Re: How to stop Solr delta import from creating a log file

2014-09-03 Thread Alexandre Rafalovitch
Is 'dataimport?command=delta-import.1' actually a file name? If this is
the case, are you running the trigger from a cron job or similar? If I
am still on the right track, check your cron job/script and see if you
have a misplaced newline or quote (e.g. an MS Word quote instead of a
normal one) or some other abnormality. It looks like a Bobby Tables
situation with runaway quotes.

Regards,
   Alex.
P.s. https://xkcd.com/327/
Personal: http://www.outerthoughts.com/ and @arafalov
Solr resources and newsletter: http://www.solr-start.com/ and @solrstart
Solr popularizers community: https://www.linkedin.com/groups?gid=6713853


On Wed, Sep 3, 2014 at 5:19 AM, madhav bahuguna
madhav.bahug...@gmail.com wrote:
 I have solr installed on Debian and every time delta import takes place a
 file gets created in my root directory. The files that get created  look
 like this


   dataimport?command=delta-import.1

   dataimport?command=delta-import.2

  .

  .

  .

   dataimport?command=delta-import.30

 Every time there is a delta import a file gets created , i opened the file
 in vi editor and its an xml file. Why are these files getting created and
 how do i stop solr from creating them.


 To start solr i use this command


   Java -jar start.jar 

 According to this command no log files should be created. Please advise and
 help iam new to solr.
 --
 Regards
 Madhav Bahuguna


Re: WordDelimiter filter, expanding to multiple words, unexpected results

2014-09-03 Thread Jonathan Rochkind
Thanks Erick and Diego. Yes, I noticed in my last message I'm not
actually using defaults; not sure why I chose non-defaults originally.


I still need to find time to make a smaller isolation/reproduction case, 
I'm getting confusing results that suggest some other part of my field 
def may be pertinent.


I'll come back when I've done that (hopefully next week), and include
the _parsed_ query from debug=query then. Thanks!


Jonathan


On 9/2/14 4:26 PM, Erick Erickson wrote:

What happens if you append debug=query to your query? IOW, what does the
_parsed_ query look like?

Also note that the defaults for WDFF are _not_ identical. catenateWords and
catenateNumbers are 1 in the
index portion and 0 in the query section. Still, this shouldn't be a
problem all other things being equal.

Best,
Erick


On Tue, Sep 2, 2014 at 12:43 PM, Jonathan Rochkind rochk...@jhu.edu wrote:


On 9/2/14 1:51 PM, Erick Erickson wrote:


bq: In my actual index, query "MacBook" is matching ONLY "mac book", and
not "macbook"

I suspect your query parameters for WordDelimiterFilterFactory doesn't
have
catenate words set.

What do you see when you enter these in both the index and query portions
of the admin/analysis page?



Thanks Erick!

Our WordDelimiterFilterFactory does have catenate words set, in both index
and query phases (is that right?):

<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
generateNumberParts="1" catenateWords="1" catenateNumbers="1"
catenateAll="0" splitOnCaseChange="1"/>

It's hard to cut and paste the results of the analysis page into email (or
anywhere!), I'll give you screenshots, sorry -- and I'll give them for our
whole real world app complex field definition. I'll also paste in our
entire field definition below. But I realize my next step is probably
creating a simpler isolation/reproduction case (unless you have a magic
answer from this!).

Again, the problem is that "MacBook" seems to be only matching on indexed
"macbook" and not indexed "mac book".


"MacBook" query analysis:
https://www.dropbox.com/s/b8y11usjdlc88un/mixedcasequery.png

"MacBook" index analysis:
https://www.dropbox.com/s/fwae3nz4tdtjhjv/mixedcaseindex.png

"mac book" index analysis:
https://www.dropbox.com/s/mihd58f6zs3rfu8/twowordindex.png


Our entire actual field definition:

<fieldType name="text" class="solr.TextField" positionIncrementGap="100"
           autoGeneratePhraseQueries="true">
  <analyzer>
    <!-- the rulefiles thing is to keep ICUTokenizerFactory from
         stripping punctuation,
         so our synonym filter involving C++ etc can still work.
         From: https://mail-archives.apache.org/mod_mbox/lucene-solr-user/201305.mbox/%3C51965E70.6070...@elyograg.org%3E
         the rbbi file is in our local ./conf, copied from lucene
         source tree -->
    <tokenizer class="solr.ICUTokenizerFactory"
               rulefiles="Latn:Latin-break-only-on-whitespace.rbbi"/>

    <filter class="solr.SynonymFilterFactory"
            synonyms="punctuation-whitelist.txt"
            ignoreCase="true"/>

    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1" catenateWords="1"
            catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>

    <!-- folding needs to be after WordDelimiter, so WordDelimiter
         can do its thing with full cases and such -->
    <filter class="solr.ICUFoldingFilterFactory"/>

    <!-- ICUFolding already includes lowercasing, no
         need for a separate lowercasing step
    <filter class="solr.LowerCaseFilterFactory"/>
    -->

    <filter class="solr.SnowballPorterFilterFactory"
            language="English" protected="protwords.txt"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>









Re: How to stop Solr delta import from creating a log file

2014-09-03 Thread Chris Hostetter

: I have solr installed on Debian and every time delta import takes place a
: file gets created in my root directory. The files that get created  look
: like this
: 
: 
:   dataimport?command=delta-import.1

that is exactly the output you would expect to see if you have a cron 
somewhere, running wget against the DIH, as root...


hossman@frisbee:~/tmp/dh$ wget --quiet "http://localhost:8983/solr/rss/dataimport?command=delta-import"
hossman@frisbee:~/tmp/dh$ ls
dataimport?command=delta-import
hossman@frisbee:~/tmp/dh$ wget --quiet "http://localhost:8983/solr/rss/dataimport?command=delta-import"
hossman@frisbee:~/tmp/dh$ wget --quiet "http://localhost:8983/solr/rss/dataimport?command=delta-import"
hossman@frisbee:~/tmp/dh$ wget --quiet "http://localhost:8983/solr/rss/dataimport?command=delta-import"
hossman@frisbee:~/tmp/dh$ wget --quiet "http://localhost:8983/solr/rss/dataimport?command=delta-import"
hossman@frisbee:~/tmp/dh$ ls
dataimport?command=delta-import    dataimport?command=delta-import.3
dataimport?command=delta-import.1  dataimport?command=delta-import.4
dataimport?command=delta-import.2
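If a cron job like that turns out to be the cause, a small tweak keeps the
trigger but discards the response instead of writing a new file each time,
e.g. (a sketch, using the same URL as above):

wget --quiet -O /dev/null "http://localhost:8983/solr/rss/dataimport?command=delta-import"

or, equivalently, with curl:

curl --silent "http://localhost:8983/solr/rss/dataimport?command=delta-import" > /dev/null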



-Hoss
http://www.lucidworks.com/


How to change search component parameters dynamically using query

2014-09-03 Thread bbarani
Hi,

I use the below highlight search component in one of my request handler.

I am trying to figure out a way to change the values of the highlight search
component dynamically from the query. Is it possible to modify the
parameters dynamically using the query (without creating another
search component)?


<searchComponent class="solr.HighlightComponent" name="highlight">
  <highlighting>
    <boundaryScanner class="solr.highlight.SimpleBoundaryScanner"
                     default="false" name="simple">
      <lst name="defaults">
        <str name="hl.bs.maxScan">200</str>
        <str name="hl.bs.chars">.</str>
      </lst>
    </boundaryScanner>
    <boundaryScanner class="solr.highlight.BreakIteratorBoundaryScanner"
                     default="true" name="breakIterator">
      <lst name="defaults">
        <str name="hl.bs.type">SENTENCE</str>
        <str name="hl.bs.language">en</str>
        <str name="hl.bs.country">US</str>
      </lst>
    </boundaryScanner>
  </highlighting>
</searchComponent>





Re: Importing RDF/XML in Solr

2014-09-03 Thread Mikhail Khludnev
IIRC, Lucene in Action describes http://rdelbru.github.io/SIREn/ in one of
its appendixes. I know that they spoke at LuceneRevolution recently; that's
all I know.


On Wed, Sep 3, 2014 at 2:40 PM, Pragati Meena pme...@bostonanalytics.com
wrote:

   Hi,
 I want to index rdf/xml document into solr. I am attaching the XML input
 document . I want to identify person, location and organization in solr.
 So I have made changes in data-config, schema.xml and added request
 handler in solrconfig.xml. But person, organization, location are not
 indexed into solr.
 Please tell me what is it that I am missing here.

  Thanks
 Pragati Meena
 Big Data Engineer

 Phone: +91-9910584024
 E-mail: pme...@bostonanalytics.com


 DISCLAIMER: The information contained in this e-mail message or any
 attachments to it may contain confidential or privileged information. If
 you are not the intended recipient, any dissemination, use, review,
 distribution, printing or copying of the information contained in this
 e-mail message and / or attachments to it is strictly prohibited. If you
 have received this communication in error, please notify us by reply e-mail
 and immediately delete the e-mail. Thank you.-




-- 
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics

http://www.griddynamics.com
 mkhlud...@griddynamics.com


Re: looking for a solr/search expert in Paris

2014-09-03 Thread elisabeth benoit
Thanks a lot for your answers.

Best regards,
Elisabeth


2014-09-03 17:18 GMT+02:00 Jack Krupansky j...@basetechnology.com:

 Don't forget to check out the Solr Support wiki where consultants
 advertise their services:
 http://wiki.apache.org/solr/Support

 And any Solr or Lucene consultants on this mailing list should be sure
 that they are registered on that support wiki. Hey, it's free! And be
 sure to keep your listing up to date, including regional availability and
 any specialties.

 -- Jack Krupansky

 -Original Message- From: elisabeth benoit
 Sent: Wednesday, September 3, 2014 4:02 AM
 To: solr-user@lucene.apache.org
 Subject: looking for a solr/search expert in Paris


 Hello,


 We are looking for a solr consultant to help us with our devs using solr.
 We've been working on this for a little while, and we feel we need an
 expert point of view on what we're doing, who could give us insights about
 our solr conf, performance issues, error handling issues (big thing). Well
 everything.

 The entreprise is in the Paris (France) area. Any suggestion is welcomed.

 Thanks,
 Elisabeth



Is there a way to modify the request handler parameters dynamically?

2014-09-03 Thread bbarani
Hi,

I need to change the components (inside a request handler) dynamically using
query parameters instead of creating multiple request handlers. Is it
possible to do this on the fly from the query?

For Ex:

change the highlight search component to a different search component
based on a query parameter

<requestHandler class="solr.StandardRequestHandler" name="/test">
  <arr name="components">
    <str>filterbyrole</str>
    <str>landingPage</str>
    <str>firstRulesComp</str>
    <str>query</str>
    <str>highlight</str>
    <str>facet</str>
    <str>spellcheck</str>
    <str>lastRulesComp</str>
    <str>debug</str>
    <str>elevator</str>
  </arr>
</requestHandler>






Server is shutting down due to threads

2014-09-03 Thread Ethan
We have a SolrCloud instance with 2 Solr nodes and a 3-node ZooKeeper
ensemble. One of the Solr nodes goes down as soon as we send search traffic
to it, but updates work fine.

When I analyzed a thread dump I saw a lot of blocked threads with the
following error message. This explains why it couldn't create any native
threads and ran out of memory. The thread count went from 48 to 900 within
minutes and the server came down. The other node, with the same
configuration, is taking all the search and update traffic and is running
fine.

Any pointers would be appreciated.

http-bio-52158-exec-59 - Thread t@589
   java.lang.Thread.State: BLOCKED on
org.apache.lucene.search.FieldCache$CreationPlaceholder@29e0400b owned by:
http-bio-52158-exec-61
 at
org.apache.lucene.search.FieldCacheImpl$Cache.get(FieldCacheImpl.java:209)
at org.apache.lucene.search.FieldCacheImpl.getLongs(FieldCacheImpl.java:901)
 at
org.apache.lucene.search.FieldComparator$LongComparator.setNextReader(FieldComparator.java:685)
at
org.apache.lucene.search.TopFieldCollector$OneComparatorNonScoringCollector.setNextReader(TopFieldCollector.java:97)
 at
org.apache.lucene.search.TimeLimitingCollector.setNextReader(TimeLimitingCollector.java:158)
at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:618)
 at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:297)
at
org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:1501)
 at
org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1367)
at
org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:474)
 at
org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:434)
at
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:208)
 at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1859)
 at
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:703)
at
com.trimp.search.filter.LogAndAuthFilter.execute(LogAndAuthFilter.scala:109)
 at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:406)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:195)
 at
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
at
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
 at
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:222)
at
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:123)
 at
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171)
at
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:99)
 at
org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:947)
at org.apache.catalina.valves.RemoteIpValve.invoke(RemoteIpValve.java:680)
 at
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
at
org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:408)
 at
org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1009)
at
org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:589)
 at
org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:310)
- locked org.apache.tomcat.util.net.SocketWrapper@5b4530c8
 at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
 at java.lang.Thread.run(Thread.java:722)

   Locked ownable synchronizers:
- locked java.util.concurrent.ThreadPoolExecutor$Worker@63d2720

-E


Re: Server is shutting down due to threads

2014-09-03 Thread Ethan
Forgot to add the source thread that's blocking every other thread:


http-bio-52158-exec-61 - Thread t@591
   java.lang.Thread.State: RUNNABLE
 at
org.apache.lucene.search.FieldCacheImpl$Uninvert.uninvert(FieldCacheImpl.java:312)
at
org.apache.lucene.search.FieldCacheImpl$LongCache.createValue(FieldCacheImpl.java:986)
 at
org.apache.lucene.search.FieldCacheImpl$Cache.get(FieldCacheImpl.java:212)
- locked org.apache.lucene.search.FieldCache$CreationPlaceholder@29e0400b
 at
org.apache.lucene.search.FieldCacheImpl.getLongs(FieldCacheImpl.java:901)
at
org.apache.lucene.search.FieldComparator$LongComparator.setNextReader(FieldComparator.java:685)
 at
org.apache.lucene.search.TopFieldCollector$OneComparatorNonScoringCollector.setNextReader(TopFieldCollector.java:97)
at
org.apache.lucene.search.TimeLimitingCollector.setNextReader(TimeLimitingCollector.java:158)
 at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:618)
at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:297)
 at
org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:1501)
at
org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1367)
 at
org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:474)
at
org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:434)
 at
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:208)
at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
 at org.apache.solr.core.SolrCore.execute(SolrCore.java:1859)
at
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:703)
 at
com.trimp.search.filter.LogAndAuthFilter.execute(LogAndAuthFilter.scala:109)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:406)
 at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:195)
at
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
 at
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
at
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:222)
 at
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:123)
at
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171)
 at
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:99)
at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:947)
 at org.apache.catalina.valves.RemoteIpValve.invoke(RemoteIpValve.java:680)
at
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
 at
org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:408)
at
org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1009)
 at
org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:589)
at
org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:310)
 - locked org.apache.tomcat.util.net.SocketWrapper@7826692
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
 at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:722)

   Locked ownable synchronizers:
- locked java.util.concurrent.ThreadPoolExecutor$Worker@2463aef


On Wed, Sep 3, 2014 at 2:31 PM, Ethan eh198...@gmail.com wrote:

 We have SolrCloud instance with 2 solr nodes and 3 zk ensemble.  One of
 the solr node goes down as soon as we send search traffic to it, but update
 works fine.

 When I analyzed thread dump I saw lot of blocked threads with following
 error message.  This explains why it couldn't create any native threads and
 ran out of memory.  The thread count went from 48 to 900 within minutes and
 server came down.  The other node with same configuration is taking all the
 search and update traffic, and it running fine.

 Any pointers would be appreciated.

 http-bio-52158-exec-59 - Thread t@589
java.lang.Thread.State: BLOCKED on
 org.apache.lucene.search.FieldCache$CreationPlaceholder@29e0400b owned
 by: http-bio-52158-exec-61
  at
 org.apache.lucene.search.FieldCacheImpl$Cache.get(FieldCacheImpl.java:209)
 at
 org.apache.lucene.search.FieldCacheImpl.getLongs(FieldCacheImpl.java:901)
  at
 org.apache.lucene.search.FieldComparator$LongComparator.setNextReader(FieldComparator.java:685)
 at
 org.apache.lucene.search.TopFieldCollector$OneComparatorNonScoringCollector.setNextReader(TopFieldCollector.java:97)
  at
 org.apache.lucene.search.TimeLimitingCollector.setNextReader(TimeLimitingCollector.java:158)
 at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:618)
  at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:297)
 at
 org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:1501)
  at
 

Re: Is there a way to modify the request handler parameters dynamically?

2014-09-03 Thread Ahmet Arslan
Hi,

You can skip certain components. Every component has a name; if you set its
name to false, it is skipped. Example: facet=false or query=false.

But you cannot change their order. You need a custom RequestHandler for that.

Ahmet
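For instance, assuming a handler registered as /test on a core named
collection1 (both placeholders), a single request that skips faceting,
spellcheck, and highlighting might look like the sketch below; facet=false
follows the explanation above, and hl=false is the usual request parameter
for the highlight component:

  curl "http://localhost:8983/solr/collection1/test?q=foo&facet=false&spellcheck=false&hl=false"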



On Wednesday, September 3, 2014 10:12 PM, bbarani bbar...@gmail.com wrote:
Hi,

I need to change the components (inside a request handler) dynamically using
query parameters instead of creating multiple request handlers. Is it
possible to do this on the fly from the query?

For Ex:

change the highlight search component to use different search component
based on a query parameter

<requestHandler class="solr.StandardRequestHandler" name="/test">
  <arr name="components">
    <str>filterbyrole</str>
    <str>landingPage</str>
    <str>firstRulesComp</str>
    <str>query</str>
    <str>highlight</str>
    <str>facet</str>
    <str>spellcheck</str>
    <str>lastRulesComp</str>
    <str>debug</str>
    <str>elevator</str>
  </arr>
</requestHandler>







Re: WordDelimiter filter, expanding to multiple words, unexpected results

2014-09-03 Thread Erick Erickson
Jonathan:

If at all possible, delete your collection/data directory (the whole
directory, including data) between runs after you've changed
your schema (at least any of your analysis that pertains to indexing).
Mixing old and new schema definitions can add to the confusion!

Good luck!
Erick

On Wed, Sep 3, 2014 at 8:48 AM, Jonathan Rochkind rochk...@jhu.edu wrote:
 Thanks Erick and Diego. Yes, I noticed in my last message I'm not actually
 using defaults, not sure why I chose non-defaults originally.

 I still need to find time to make a smaller isolation/reproduction case, I'm
 getting confusing results that suggest some other part of my field def may
 be pertinent.

 I'll come back when I've done that (hopefully next week), and include the
 _parsed_ from debug=query then. Thanks!

 Jonathan



 On 9/2/14 4:26 PM, Erick Erickson wrote:

 What happens if you append debug=query to your query? IOW, what does the
 _parsed_ query look like?

 Also note that the defaults for WDFF are _not_ identical. catenateWords
 and
 catenateNumbers are 1 in the
 index portion and 0 in the query section. Still, this shouldn't be a
 problem all other things being equal.

 Best,
 Erick


 On Tue, Sep 2, 2014 at 12:43 PM, Jonathan Rochkind rochk...@jhu.edu
 wrote:

 On 9/2/14 1:51 PM, Erick Erickson wrote:

 bq: In my actual index, query MacBook is matching ONLY mac book, and
 not macbook

 I suspect your query parameters for WordDelimiterFilterFactory doesn't
 have
 catenate words set.

 What do you see when you enter these in both the index and query
 portions
 of the admin/analysis page?


 Thanks Erick!

 Our WordDelimiterFilterFactory does have catenate words set, in both
 index
 and query phases (is that right?):

 <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
 generateNumberParts="1" catenateWords="1" catenateNumbers="1"
 catenateAll="0" splitOnCaseChange="1"/>

 It's hard to cut and paste the results of the analysis page into email
 (or
 anywhere!), I'll give you screenshots, sorry -- and I'll give them for
 our
 whole real world app complex field definition. I'll also paste in our
 entire field definition below. But I realize my next step is probably
 creating a simpler isolation/reproduction case (unless you have a magic
 answer from this!).

 Again, the problem is that MacBook seems to be only matching on indexed
 macbook and not indexed mac book.


 MacBook query analysis:
 https://www.dropbox.com/s/b8y11usjdlc88un/mixedcasequery.png

 MacBook index analysis:
 https://www.dropbox.com/s/fwae3nz4tdtjhjv/mixedcaseindex.png

 mac book index analysis:
 https://www.dropbox.com/s/mihd58f6zs3rfu8/twowordindex.png


 Our entire actual field definition:

 <fieldType name="text" class="solr.TextField" positionIncrementGap="100"
            autoGeneratePhraseQueries="true">
   <analyzer>
     <!-- the rulefiles thing is to keep ICUTokenizerFactory from
          stripping punctuation,
          so our synonym filter involving C++ etc can still work.
          From: https://mail-archives.apache.org/mod_mbox/lucene-solr-user/201305.mbox/%3C51965E70.6070...@elyograg.org%3E
          the rbbi file is in our local ./conf, copied from lucene
          source tree -->
     <tokenizer class="solr.ICUTokenizerFactory"
                rulefiles="Latn:Latin-break-only-on-whitespace.rbbi"/>

     <filter class="solr.SynonymFilterFactory"
             synonyms="punctuation-whitelist.txt"
             ignoreCase="true"/>

     <filter class="solr.WordDelimiterFilterFactory"
             generateWordParts="1" generateNumberParts="1" catenateWords="1"
             catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>

     <!-- folding needs to be after WordDelimiter, so WordDelimiter
          can do its thing with full cases and such -->
     <filter class="solr.ICUFoldingFilterFactory"/>

     <!-- ICUFolding already includes lowercasing, no
          need for a separate lowercasing step
     <filter class="solr.LowerCaseFilterFactory"/>
     -->

     <filter class="solr.SnowballPorterFilterFactory"
             language="English" protected="protwords.txt"/>
     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
   </analyzer>
 </fieldType>









Re: How to change search component parameters dynamically using query

2014-09-03 Thread Erick Erickson
Depends on which ones. Any parameter in the defaults section
can be overridden dynamically, i.e.

&hl.bs.language=fr
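
So a per-request override might look like this sketch (the core and handler
names are placeholders):

curl "http://localhost:8983/solr/collection1/select?q=foo&hl=true&hl.bs.language=fr&hl.bs.country=FR"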

Best,
Erick

On Wed, Sep 3, 2014 at 10:38 AM, bbarani bbar...@gmail.com wrote:
 Hi,

 I use the below highlight search component in one of my request handler.

 I am trying to figure out a way to change the value of highlight search
 component dynamically from the query. Is it possible to modify the
 parameters dynamically using the query (without creating another
 searchcomponent)?


 <searchComponent class="solr.HighlightComponent" name="highlight">
   <highlighting>
     <boundaryScanner class="solr.highlight.SimpleBoundaryScanner"
                      default="false" name="simple">
       <lst name="defaults">
         <str name="hl.bs.maxScan">200</str>
         <str name="hl.bs.chars">.</str>
       </lst>
     </boundaryScanner>
     <boundaryScanner class="solr.highlight.BreakIteratorBoundaryScanner"
                      default="true" name="breakIterator">
       <lst name="defaults">
         <str name="hl.bs.type">SENTENCE</str>
         <str name="hl.bs.language">en</str>
         <str name="hl.bs.country">US</str>
       </lst>
     </boundaryScanner>
   </highlighting>
 </searchComponent>





Re: Server is shutting down due to threads

2014-09-03 Thread Erick Erickson
Do you have indexing traffic going to it? Because this _looks_
like the node is just starting up, or a searcher is
being opened and you're loading your
index for the first time. This happens when you index data and
when you start up your nodes. Adding some autowarming
(firstSearcher in this case) might load up the underlying
caches earlier. This could also be a problem due to
very short commit intervals, although the latter should
be identical for both nodes.

And when you say 2 solr nodes, is this one shard or two?

I'm guessing that you have some setting that's significantly
different, memory perhaps?

Best,
Erick



On Wed, Sep 3, 2014 at 2:40 PM, Ethan eh198...@gmail.com wrote:
 Forgot to add the source thread thats blocking every other thread


 http-bio-52158-exec-61 - Thread t@591
java.lang.Thread.State: RUNNABLE
  at
 org.apache.lucene.search.FieldCacheImpl$Uninvert.uninvert(FieldCacheImpl.java:312)
 at
 org.apache.lucene.search.FieldCacheImpl$LongCache.createValue(FieldCacheImpl.java:986)
  at
 org.apache.lucene.search.FieldCacheImpl$Cache.get(FieldCacheImpl.java:212)
 - locked org.apache.lucene.search.FieldCache$CreationPlaceholder@29e0400b
  at
 org.apache.lucene.search.FieldCacheImpl.getLongs(FieldCacheImpl.java:901)
 at
 org.apache.lucene.search.FieldComparator$LongComparator.setNextReader(FieldComparator.java:685)
  at
 org.apache.lucene.search.TopFieldCollector$OneComparatorNonScoringCollector.setNextReader(TopFieldCollector.java:97)
 at
 org.apache.lucene.search.TimeLimitingCollector.setNextReader(TimeLimitingCollector.java:158)
  at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:618)
 at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:297)
  at
 org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:1501)
 at
 org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1367)
  at
 org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:474)
 at
 org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:434)
  at
 org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:208)
 at
 org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
  at org.apache.solr.core.SolrCore.execute(SolrCore.java:1859)
 at
 org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:703)
  at
 com.trimp.search.filter.LogAndAuthFilter.execute(LogAndAuthFilter.scala:109)
 at
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:406)
  at
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:195)
 at
 org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
  at
 org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
 at
 org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:222)
  at
 org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:123)
 at
 org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171)
  at
 org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:99)
 at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:947)
  at org.apache.catalina.valves.RemoteIpValve.invoke(RemoteIpValve.java:680)
 at
 org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
  at
 org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:408)
 at
 org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1009)
  at
 org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:589)
 at
 org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:310)
  - locked org.apache.tomcat.util.net.SocketWrapper@7826692
 at
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
  at
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
 at java.lang.Thread.run(Thread.java:722)

Locked ownable synchronizers:
 - locked java.util.concurrent.ThreadPoolExecutor$Worker@2463aef


 On Wed, Sep 3, 2014 at 2:31 PM, Ethan eh198...@gmail.com wrote:

 We have SolrCloud instance with 2 solr nodes and 3 zk ensemble.  One of
 the solr node goes down as soon as we send search traffic to it, but update
 works fine.

 When I analyzed thread dump I saw lot of blocked threads with following
 error message.  This explains why it couldn't create any native threads and
 ran out of memory.  The thread count went from 48 to 900 within minutes and
 server came down.  The other node with same configuration is taking all the
 search and update traffic, and it running fine.

 Any pointers would be appreciated.

 http-bio-52158-exec-59 - Thread t@589
java.lang.Thread.State: BLOCKED on
 org.apache.lucene.search.FieldCache$CreationPlaceholder@29e0400b owned
 by: http-bio-52158-exec-61
  at
 

[ANNOUNCE] Apache Lucene 4.10.0 released

2014-09-03 Thread Ryan Ernst
3 September 2014, Apache Lucene™ 4.10.0 available

The Lucene PMC is pleased to announce the release of Apache Lucene 4.10.0

Apache Lucene is a high-performance, full-featured text search engine
library written entirely in Java. It is a technology suitable for nearly
any application that requires full-text search, especially cross-platform.

The release is available for immediate download at:
  http://lucene.apache.org/core/mirrors-core-latest-redir.html

Lucene 4.10.0 Release Highlights:

* New TermAutomatonQuery using an automaton for proximity queries.
  
http://blog.mikemccandless.com/2014/08/a-new-proximity-query-for-lucene-using.html

* New OrdsBlockTree terms dictionary supporting ord lookup.

* Simplified matchVersion handling for Analyzers with new setVersion
method, as well as Analyzer constructors not requiring Version.

* Fixed possible corruption when opening a 3.x index with NRT reader.

* Fixed edge case in StandardTokenizer that caused extremely slow
parsing times with long text which partially matched grammar rules.

This release contains numerous bug fixes, optimizations, and improvements.
Please read CHANGES.txt for a full list of new features and changes:
  https://lucene.apache.org/core/4_10_0/changes/Changes.html

Please report any feedback to the mailing lists
(http://lucene.apache.org/core/discussion.html)

Note: The Apache Software Foundation uses an extensive mirroring network
for distributing releases.  It is possible that the mirror you are using
may not have replicated the release yet.  If that is the case, please
try another mirror.  This also goes for Maven access.

On behalf of the Lucene PMC,
Happy Searching


[ANNOUNCE] Apache Solr 4.10.0 released

2014-09-03 Thread Ryan Ernst
3 September 2014, Apache Solr™ 4.10.0 available

The Lucene PMC is pleased to announce the release of Apache Solr 4.10.0

Solr is the popular, blazing fast, open source NoSQL search platform
from the Apache Lucene project. Its major features include powerful
full-text search, hit highlighting, faceted search, dynamic
clustering, database integration, rich document (e.g., Word, PDF)
handling, and geospatial search.  Solr is highly scalable, providing
fault tolerant distributed search and indexing, and powers the search
and navigation features of many of the world's largest internet sites.

Solr 4.10.0 is available for immediate download at:
  http://lucene.apache.org/solr/mirrors-solr-latest-redir.html

Solr 4.10.0 Release Highlights:

* This release upgrades Solr Cell's (contrib/extraction) dependency
  on Apache POI to mitigate 2 security vulnerabilities:
http://s.apache.org/solr-cell-security-notice

* Scripts for starting, stopping, and running Solr examples

* Distributed query support for facet.pivot

* Interval Faceting for Doc Values fields

* New terms QParser for efficiently filtering documents by a list of values
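
As an illustration, the new terms QParser can be used as a filter query
along these lines (a sketch; the core, field, and values are placeholders):

  curl -G "http://localhost:8983/solr/collection1/select" \
    --data-urlencode "q=*:*" \
    --data-urlencode "fq={!terms f=id}doc1,doc2,doc3"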

Solr 4.10.0 also includes many other new features as well as numerous
optimizations and bugfixes of the corresponding Apache Lucene release.
Please read CHANGES.txt for a full list of new features and changes:
  https://lucene.apache.org/solr/4_10_0/changes/Changes.html

Note: The Apache Software Foundation uses an extensive mirroring network
for distributing releases.  It is possible that the mirror you are using
may not have replicated the release yet.  If that is the case, please
try another mirror.  This also goes for Maven access.

On behalf of the Lucene PMC,
Happy Searching


DELETEREPLICA

2014-09-03 Thread Erick Erickson
I'm confused, wondering if it's a mismatch between the docs and the
intent or just a bug or whether I'm just not understanding the point:

The DELETEREPLICA docs say:

Delete a replica from a given collection and shard. If the
corresponding core is up and running the core is unloaded and the
entry is removed from the clusterstate. If the node/core is down, the
entry is taken off the clusterstate and if the core comes up later it
is automatically unregistered.

However, if I do the following:
1> create a follower on nodeX
2> shut down nodeX (at this point, the clusterstate indicates the
follower is down)
3> issue a DELETEREPLICA for the follower (the clusterstate entry for this
follower is removed; see the sketch after this list)
4> restart nodeX (the clusterstate shows this node is back, it's visible
in the cloud view, it gets sync'd, etc.)
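
(For concreteness, step 3> is a Collections API call along these lines; the
collection, shard, and replica names here are illustrative:

  curl "http://localhost:8983/solr/admin/collections?action=DELETEREPLICA&collection=test&shard=shard1&replica=core_node2"
)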

Based on the docs, I didn't expect to see the node present in step 4;
what am I missing?

The core has docs (i.e. it's synched from the leader) etc. So this bit
of the documentation is confusing me: If the node/core is down, the
entry is taken off the clusterstate and if the core comes up later it
is automatically unregistered.

That doesn't square with what I'm seeing so either the docs are wrong
or I'm misunderstanding the intent.

If the node _is_ up, then the replica is removed from the node and the
clusterstate, and stays gone.

Personally, I don't particularly like the idea of queueing up the
DELETEREPLICAs for later execution; it seems overly complex.
Having the clusterstate info removed if the node is down seems very
useful though.

Thanks,
Erick


Solr add document over 20 times slower after upgrade from 4.0 to 4.9

2014-09-03 Thread Li, Ryan
I have a Solr server that indexes 2500 documents (up to 50MB each, average
3MB). When running on Solr 4.0 I managed to finish indexing in 3 hours.

However, after we upgraded to Solr 4.9, indexing needs 3 days to finish.

I've done some profiling; the numbers I get are:

  size of document | time for adding (Solr 4.0) | time for adding (Solr 4.9)
  1.18             | 6 sec                      | 123 sec
  2.26             | 12 sec                     | 444 sec
  3.35             | 18 sec                     | over 600 sec
  9.65             | 46 sec                     | timeout

From what I can see, indexing time looks roughly linear in document size on
Solr 4.0, but grows much faster than linearly (closer to quadratic, judging
by the numbers above) on Solr 4.9. I also tried commenting out some copied
fields to narrow down the problem; the size of the document after indexing
(we copy fields, and the more fields we copy, the bigger the index size is)
seems to be the dominating factor for indexing time.

Just wondering, has anyone experienced a similar problem? Does that sound
like a bug in Solr, or have we just used Solr 4.9 wrong?

Here is one example of  field definition in my schema file.
<fieldType name="text_stem" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <!-- strip off all apostrophe (') characters -->
    <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="'+" replacement=""/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SynonymFilterFactory" expand="true"
            ignoreCase="true" synonyms="../../resources/type-index-synonyms.txt"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English"/>
    <!-- Used to have language="English" - seems this param is gone in 4.9 -->
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <!-- strip off all apostrophe (') characters -->
    <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="'+" replacement=""/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SynonymFilterFactory" expand="true"
            ignoreCase="true" synonyms="../../resources/type-query-colloq-synonyms.txt"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English"/>
    <!-- Used to have language="English" - seems this param is gone in 4.9 -->
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>
Field:
<field name="majorTextSignalStem" type="text_stem" indexed="true"
       stored="false" multiValued="true" omitNorms="false"/>
Copy:
<copyField dest="majorTextSignalStem" source="majorTextSignalRaw"/>
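
For anyone who wants to reproduce the per-document timings above, here is
a minimal SolrJ sketch (the core URL, document id, and text size are
illustrative; HttpSolrServer is the 4.x-era client class):

import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class AddTimer {
    public static void main(String[] args) throws Exception {
        // Core URL is a placeholder; point it at the core under test.
        HttpSolrServer server = new HttpSolrServer("http://localhost:8080/solr/bookindex");

        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", "timing-test-1");
        // majorTextSignalRaw is copied into majorTextSignalStem by the copyField above.
        doc.addField("majorTextSignalRaw", makeText(3 * 1024 * 1024)); // ~3MB of text

        long start = System.currentTimeMillis();
        server.add(doc);
        server.commit();
        System.out.println("add + commit took "
                + (System.currentTimeMillis() - start) + " ms");
        server.shutdown();
    }

    // Builds a text blob of roughly the requested size in characters.
    private static String makeText(int size) {
        StringBuilder sb = new StringBuilder(size);
        while (sb.length() < size) {
            sb.append("some repeated sample text ");
        }
        return sb.toString();
    }
}

Running the same sketch against a 4.0 and a 4.9 core with increasing text
sizes should show whether the slowdown tracks document size the way the
table above suggests.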

Thanks,
Ryan



Re: Server is shutting down due to threads

2014-09-03 Thread Ethan
Erick,

It is just one shard.  Indexing traffic is going to the other node and then
synched with this one (both are part of the cloud).  We kept that setting
running for 5 days, as the defective node would just go down under search
traffic.  So both were in sync when search was turned on.  The soft commit
interval is very low, around 2 secs, but that doesn't seem to affect the
other node, which is functioning normally.

Memory settings for both nodes are identical, including the m/c configuration.

On Wed, Sep 3, 2014 at 4:23 PM, Erick Erickson erickerick...@gmail.com
wrote:

 Do you have indexing traffic going to it? b/c this _looks_
 like the node is just starting up, or a searcher is
 being opened and you're loading your
 index for the first time. This happens when you index data and
 when you start up your nodes. Adding some autowarming
 (firstSearcher in this case) might load up the underlying
 caches earlier. This could also be a problem due to
 very short commit intervals, although the latter should
 be identical for both nodes.

 And when you say 2 solr nodes, is this one shard or two?

 I'm guessing that you have some setting that's significantly
 different, memory perhaps?

 Best,
 Erick



 On Wed, Sep 3, 2014 at 2:40 PM, Ethan eh198...@gmail.com wrote:
  Forgot to add the source thread that's blocking every other thread:
 
 
  http-bio-52158-exec-61 - Thread t@591
    java.lang.Thread.State: RUNNABLE
      at org.apache.lucene.search.FieldCacheImpl$Uninvert.uninvert(FieldCacheImpl.java:312)
      at org.apache.lucene.search.FieldCacheImpl$LongCache.createValue(FieldCacheImpl.java:986)
      at org.apache.lucene.search.FieldCacheImpl$Cache.get(FieldCacheImpl.java:212)
      - locked org.apache.lucene.search.FieldCache$CreationPlaceholder@29e0400b
      at org.apache.lucene.search.FieldCacheImpl.getLongs(FieldCacheImpl.java:901)
      at org.apache.lucene.search.FieldComparator$LongComparator.setNextReader(FieldComparator.java:685)
      at org.apache.lucene.search.TopFieldCollector$OneComparatorNonScoringCollector.setNextReader(TopFieldCollector.java:97)
      at org.apache.lucene.search.TimeLimitingCollector.setNextReader(TimeLimitingCollector.java:158)
      at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:618)
      at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:297)
      at org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:1501)
      at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1367)
      at org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:474)
      at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:434)
      at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:208)
      at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
      at org.apache.solr.core.SolrCore.execute(SolrCore.java:1859)
      at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:703)
      at com.trimp.search.filter.LogAndAuthFilter.execute(LogAndAuthFilter.scala:109)
      at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:406)
      at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:195)
      at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
      at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
      at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:222)
      at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:123)
      at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171)
      at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:99)
      at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:947)
      at org.apache.catalina.valves.RemoteIpValve.invoke(RemoteIpValve.java:680)
      at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
      at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:408)
      at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1009)
      at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:589)
      at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:310)
      - locked org.apache.tomcat.util.net.SocketWrapper@7826692
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
      at java.lang.Thread.run(Thread.java:722)

    Locked ownable synchronizers:
      - locked java.util.concurrent.ThreadPoolExecutor$Worker@2463aef
 
 
  On Wed, Sep 3, 2014 at 2:31 PM, Ethan eh198...@gmail.com wrote:
 
  We have a SolrCloud instance with 2 Solr nodes and a 3-node zk ensemble.  One of
 

Re: Server is shutting down due to threads

2014-09-03 Thread Erick Erickson
Hmmm, I'm puzzled then. I'm guessing that the node
that keeps going down is the follower, which means
it should have _less_ work to do than the node that
stays up. Not a lot less, but less still.

I'd try lengthening the commit interval. I realize you've
set it to 2 seconds for a reason; this is mostly to see
if it has any effect and to have a place to _start_ looking.

I'm assuming your hard commit has openSearcher set to false.
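
For reference, these settings live in the updateHandler section of
solrconfig.xml; a sketch with illustrative values (not a recommendation):

<updateHandler class="solr.DirectUpdateHandler2">
  <autoCommit>
    <maxTime>60000</maxTime>            <!-- hard commit every 60s -->
    <openSearcher>false</openSearcher>  <!-- don't open a new searcher on hard commit -->
  </autoCommit>
  <autoSoftCommit>
    <maxTime>2000</maxTime>             <!-- the current 2s soft commit; try lengthening this -->
  </autoSoftCommit>
</updateHandler>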

Just to double check: these two nodes are just a leader and
follower, right? IOW, they're part of the same collection, and
your collection has just one shard.

m/c configuration? What's that? If it's a typo for m/s
(master/slave), then that may be an issue. In a SolrCloud
setup there is no master/slave, and you shouldn't configure
them.
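
And the firstSearcher autowarming I mentioned in my earlier reply also
goes in solrconfig.xml. A minimal sketch, assuming the field being sorted
on (the long field the stack trace shows being uninverted) is called
my_long_field:

<listener event="firstSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <!-- Sorting on the long field forces the FieldCache uninversion
         (the FieldCacheImpl.getLongs call in the trace) at warm-up
         time rather than on the first user query. -->
    <lst>
      <str name="q">*:*</str>
      <str name="sort">my_long_field asc</str>
    </lst>
  </arr>
</listener>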

Best,
Erick

On Wed, Sep 3, 2014 at 8:52 PM, Ethan eh198...@gmail.com wrote:
 Erick,

 It is just one shard.  Indexing traffic is going to the other node and then
 synched with this one (both are part of the cloud).  We kept that setting
 running for 5 days, as the defective node would just go down under search
 traffic.  So both were in sync when search was turned on.  The soft commit
 interval is very low, around 2 secs, but that doesn't seem to affect the
 other node, which is functioning normally.

 Memory settings for both nodes are identical, including the m/c configuration.
