Highlighting integer field

2014-12-11 Thread Pawel Rog
Hi,
Is it possible to highlight an int (TrieIntField) or long (TrieLongField)
field in Solr?

--
Paweł
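
For reference: Solr's highlighter generally needs an analyzed, stored text
field to work on, so trie-encoded numeric fields are usually not highlightable
directly. A common workaround is to copy the value into a text field and
highlight that. A minimal sketch, assuming a copyField to a hypothetical
price_text field exists in schema.xml; host, core and field names are
placeholders:

# Sketch only: assumes schema.xml has something like
#   <copyField source="price" dest="price_text"/>
# so the highlighter has analyzed, stored text to work on.
curl "http://localhost:8983/solr/select?q=price_text:199&hl=true&hl.fl=price_text&wt=json&indent=true"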


Re: Edismax parser and boosts

2014-10-09 Thread Pawel Rog
Hi,
Thank you for your response.
I checked it in Solr 4.8, but I think it has worked as I described for a very
long time. I'm not 100% sure whether it is really a bug or not. When I run a
phrase query like foo^1.0 bar it works very similarly to what happens in
edismax with the *pf* parameter set (the boost part is not removed).

--
Paweł Róg

On Thu, Oct 9, 2014 at 12:07 AM, Jack Krupansky j...@basetechnology.com
wrote:

 Definitely sounds like a bug! File a Jira. Thanks for reporting this. What
 release of Solr?



 -- Jack Krupansky
 -Original Message- From: Pawel Rog
 Sent: Wednesday, October 8, 2014 3:57 PM
 To: solr-user@lucene.apache.org
 Subject: Edismax parser and boosts


 Hi,
 I use edismax query with q parameter set as below:

 q=foo^1.0+AND+bar

 For such a query for the same document I see different (lower) scoring
 value than for

 q=foo+AND+bar

 By default boost of term is 1 as far as i know so why the scoring differs?

 When I check debugQuery parameter in parsedQuery for foo^1.0+AND+bar I
 see Boolean query which one of clauses is a phrase query foo 1.0 bar. It
 seems that edismax parser takes whole q parameter as a phrase without
 removing boost value and add it as a boolean clause. Is it a bug or it
 should work like that?

 --
 Paweł Róg



Edismax parser and boosts

2014-10-08 Thread Pawel Rog
Hi,
I use edismax query with q parameter set as below:

q=foo^1.0+AND+bar

For such a query I see a different (lower) score for the same document than for

q=foo+AND+bar

By default the boost of a term is 1 as far as I know, so why does the scoring differ?

When I check the debugQuery output, in parsedQuery for foo^1.0+AND+bar I see a
Boolean query in which one of the clauses is a phrase query "foo 1.0 bar". It
seems that the edismax parser takes the whole q parameter as a phrase without
removing the boost value and adds it as a boolean clause. Is this a bug, or
should it work like that?

--
Paweł Róg
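
The easiest way to see the behaviour described above is debugQuery=true,
comparing the parsed query with and without the explicit boost. A sketch
(host, core and the qf/pf field are placeholders):

# Compare parsedquery for the two forms; with pf set, the "1.0" from the
# boost ends up inside the implicit phrase query, as described above.
curl "http://localhost:8983/solr/select?defType=edismax&qf=name&pf=name&q=foo%5E1.0+AND+bar&debugQuery=true&wt=json"
curl "http://localhost:8983/solr/select?defType=edismax&qf=name&pf=name&q=foo+AND+bar&debugQuery=true&wt=json"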


Contribute QParserPlugin

2014-05-28 Thread Pawel Rog
Hi,
I need a QParserPlugin that will use Redis as a backend to prepare filter
queries. There are several data structures available in Redis (hash, set,
etc.). For some reasons I cannot fetch data from the Redis data structures
and build and send big requests from the application. That's why I want to
build those filters on the backend (Solr) side.

I'm wondering what I have to do to contribute a QParserPlugin to the Solr
repository. Can you suggest a way (in a few steps) to publish it in the Solr
repository, probably as a contrib?

--
Paweł Róg
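
For context, a parser like this would be registered in solrconfig.xml and
then invoked through local params on fq. The sketch below only illustrates
the intended usage; the "redis" parser name and its key parameter are
hypothetical, not something that exists in stock Solr:

# Purely hypothetical usage sketch -- the redis parser and its key
# parameter are made up here to show how such a plugin would be invoked.
curl "http://localhost:8983/solr/select" \
  --data-urlencode "q=*:*" \
  --data-urlencode "fq={!redis key=user:123:allowed_ids}id" \
  --data-urlencode "wt=json"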


Solr cloud hangs

2014-02-17 Thread Pawel Rog
Hi,
I have a quite annoying problem with SolrCloud. I have a cluster with 8
shards and 2 replicas each (Solr 4.6.1).
After some time the cluster doesn't respond to any update requests. Restarting
the cluster nodes doesn't help.

There are a lot of stack traces like this one (threads waiting for a very long time):


   - sun.misc.Unsafe.park(Native Method)
   - java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
   - java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
   - org.eclipse.jetty.util.BlockingArrayQueue.poll(BlockingArrayQueue.java:342)
   - org.eclipse.jetty.util.thread.QueuedThreadPool.idleJobPoll(QueuedThreadPool.java:526)
   - org.eclipse.jetty.util.thread.QueuedThreadPool.access$600(QueuedThreadPool.java:44)
   - org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:572)
   - java.lang.Thread.run(Thread.java:722)


Do you have any idea where I can look?

--
Pawel
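
A full thread dump (which is what gets asked for later in this thread) can be
captured from the running JVM, for example like this; the pid is a placeholder:

# Capture a full thread dump of the Solr JVM (replace <pid> with the real
# process id); taking two or three dumps a few seconds apart shows whether
# the same threads stay parked.
jstack -l <pid> > /tmp/solr-threads.$(date +%s).txt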


Re: Solr cloud hangs

2014-02-17 Thread Pawel Rog
Hi,
Here is the whole stack trace: https://gist.github.com/anonymous/9056783

--
Pawel

On Mon, Feb 17, 2014 at 4:53 PM, Mark Miller markrmil...@gmail.com wrote:

 Can you share the full stack trace dump?

 - Mark

 http://about.me/markrmiller

 On Feb 17, 2014, at 7:07 AM, Pawel Rog pawelro...@gmail.com wrote:

  Hi,
  I have quite annoying problem with Solr cloud. I have a cluster with 8
  shards and with 2 replicas in each. (Solr 4.6.1)
  After some time cluster doesn't respond to any update requests.
 Restarting
  the cluster nodes doesn't help.
 
  There are a lot of such stack traces (waiting for very long time):
 
 
- sun.misc.Unsafe.park(Native Method)
-
 java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
-
 
 java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
-
 
 org.eclipse.jetty.util.BlockingArrayQueue.poll(BlockingArrayQueue.java:342)
-
 
 org.eclipse.jetty.util.thread.QueuedThreadPool.idleJobPoll(QueuedThreadPool.java:526)
-
 
 org.eclipse.jetty.util.thread.QueuedThreadPool.access$600(QueuedThreadPool.java:44)
-
 
 org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:572)
- java.lang.Thread.run(Thread.java:722)
 
 
  Do you have any idea where can I look for?
 
  --
  Pawel




Re: Solr cloud hangs

2014-02-17 Thread Pawel Rog
There are also many errors in solr log like that one:

org.apache.solr.update.StreamingSolrServers$1; error
org.apache.http.conn.ConnectionPoolTimeoutException: Timeout waiting for
connection from pool
at
org.apache.http.impl.conn.PoolingClientConnectionManager.leaseConnection(PoolingClientConnectionManager.java:232)
at
org.apache.http.impl.conn.PoolingClientConnectionManager$1.getConnection(PoolingClientConnectionManager.java:199)
at
org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:456)
at
org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906)
at
org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805)
at
org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:784)
at
org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer$Runner.run(ConcurrentUpdateSolrServer.java:232)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:722)


--
Pawel
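
The ConnectionPoolTimeoutException above means the HTTP client used for
distributing updates ran out of pooled connections. One rough way to see how
many connections a node holds open, as a sketch (8983 is an assumed port):

# Count connection states on the Solr port; a pool stuck at its limit
# usually shows up as a pile of ESTABLISHED/CLOSE_WAIT entries here.
netstat -tan | grep ':8983' | awk '{print $6}' | sort | uniq -c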


On Mon, Feb 17, 2014 at 8:01 PM, Pawel Rog pawelro...@gmail.com wrote:

 Hi,
 Here is the whole stack trace: https://gist.github.com/anonymous/9056783

 --
 Pawel


 On Mon, Feb 17, 2014 at 4:53 PM, Mark Miller markrmil...@gmail.comwrote:

 Can you share the full stack trace dump?

 - Mark

 http://about.me/markrmiller

 On Feb 17, 2014, at 7:07 AM, Pawel Rog pawelro...@gmail.com wrote:

  Hi,
  I have quite annoying problem with Solr cloud. I have a cluster with 8
  shards and with 2 replicas in each. (Solr 4.6.1)
  After some time cluster doesn't respond to any update requests.
 Restarting
  the cluster nodes doesn't help.
 
  There are a lot of such stack traces (waiting for very long time):
 
 
- sun.misc.Unsafe.park(Native Method)
-
 java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
-
 
 java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
-
 
 org.eclipse.jetty.util.BlockingArrayQueue.poll(BlockingArrayQueue.java:342)
-
 
 org.eclipse.jetty.util.thread.QueuedThreadPool.idleJobPoll(QueuedThreadPool.java:526)
-
 
 org.eclipse.jetty.util.thread.QueuedThreadPool.access$600(QueuedThreadPool.java:44)
-
 
 org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:572)
- java.lang.Thread.run(Thread.java:722)
 
 
  Do you have any idea where can I look for?
 
  --
  Pawel





Re: Wildcard query vs facet.prefix for autocomplete?

2012-07-16 Thread Pawel Rog
Maybe try EdgeNgramFilterFactory
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters/#solr.EdgeNGramFilterFactory


On Mon, Jul 16, 2012 at 6:57 AM, santamaria2 aravinda@contify.comwrote:

 I'm about to implement an autocomplete mechanism for my search box. I've
 read
 about some of the common approaches, but I have a question about wildcard
 query vs facet.prefix.

 Say I want autocomplete for a title: 'Shadows of the Damned'. I want this
 to
 appear as a suggestion if I type 'sha' or 'dam' or 'the'. I don't care that
 it won't appear if I type 'hadows'.

 While indexing, I'd use a whitespace tokenizer and a lowercase filter to
 store that title in the index.
 Now I'm thinking two approaches for 'dam' typed in the search box:

 1) q=title:dam*

 2) q=*:*&facet=on&facet.field=title&facet.prefix=dam


 So any reason that I should favour one over the other? Speed a factor? The
 index has around 200,000 items.

 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/Wildcard-query-vs-facet-prefix-for-autocomplete-tp3995199.html
 Sent from the Solr - User mailing list archive at Nabble.com.
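
With an EdgeNGram-analyzed copy of the title (the approach suggested at the
top of this reply), the autocomplete lookup becomes a plain term query rather
than a leading wildcard or a facet scan. A sketch, where title_edge is a
made-up field name:

# Assumes a separate field (title_edge here, not a real field from the
# thread) whose index analyzer adds EdgeNGramFilterFactory after
# whitespace tokenizing and lowercasing.
curl "http://localhost:8983/solr/select?q=title_edge:dam&fl=title&rows=10&wt=json"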



Re: FilterCache - maximum size of document set

2012-06-15 Thread Pawel Rog
Thanks.
I don't use NOW in queries. All my filters with a timestamp are rounded to
hundreds of seconds to increase the hit rate. The only problem could be the
price filters, which can vary (users are unpredictable :P). Moving those
filters out of fq or setting cache=false is also a bad idea ... I checked
it :) Load rose three times :)

--
Pawel
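
For reference, the cache=false variant mentioned above is expressed with
local params on the fq; on Solr 3.4+ a cost of 100 or more additionally turns
a supporting query such as frange into a post filter. A sketch with
placeholder bounds and host ("price" is the field from this thread):

# Non-cached filter via local params; cost >= 100 makes frange a post filter.
curl "http://localhost:8983/solr/select" \
  --data-urlencode "q=*:*" \
  --data-urlencode "fq={!frange cache=false cost=150 l=0 u=10000}price" \
  --data-urlencode "wt=json"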

On Fri, Jun 15, 2012 at 1:30 PM, Erick Erickson erickerick...@gmail.comwrote:

 Test first, of course, but slave on 3.6 and master on 3.5 should be
 fine. If you're
 getting evictions with the cache settings that high, you really want
 to look at why.

 Note that in particular, using NOW in your filter queries virtually
 guarantees
 that they won't be re-used as per the link I sent yesterday.

 Best
 Erick

 On Fri, Jun 15, 2012 at 1:15 AM, Pawel Rog pawelro...@gmail.com wrote:
  It can be true that filters cache max size is set to high value. That is
  also true that.
  We looked at evictions and hit rate earlier. Maybe you are right that
  evictions are
  not always unwanted. Some time ago we made tests. There are not so high
  difference in hit rate when filters maxSize is set to 4000 (hit rate
 about
  85%) and
  16000 (hitrate about 91%). I think that also using LFU cache can be
 helpful
  but
  it makes me to migrate to 3.6. Do you think it is reasonable to use
 slave on
  version 3.6 and master on 3.5?
 
  Once again, Thanks for your help
 
  --
  Pawel
 
  On Thu, Jun 14, 2012 at 7:22 PM, Erick Erickson erickerick...@gmail.com
 wrote:
 
  Hmmm, your maxSize is pretty high, it may just be that you've set this
  much higher
  than is wise. The maxSize setting governs the number of entries. I'd
 start
  with
  a much lower number here, and monitor the solr/admin page for both
  hit ratio and evictions. Well, and size too. 16,000 entries puts a
  ceiling of, what,
  48G on it? Ouch! It sounds like what's happening here is you're just
  accumulating
  more and more fqs over the course of the evening and blowing memory.
 
  Not all FQs will be that big, there's some heuristics in there to just
  store the
  document numbers for sparse filters, maxDocs/8 is pretty much the upper
  bound though.
 
  Evictions are not necessarily a bad thing, the hit-ratio is important
  here. And
  if you're using a bare NOW in your filter queries, you're probably never
  re-using them anyway, see:
 
 
 http://www.lucidimagination.com/blog/2012/02/23/date-math-now-and-filter-queries/
 
  I really question whether this limit is reasonable, but you know your
  situation best.
 
  Best
  Erick
 
  On Wed, Jun 13, 2012 at 5:40 PM, Pawel Rog pawelro...@gmail.com
 wrote:
   Thanks for your response
   Yes, maybe you are right. I thought that filters can be larger than
 3M.
  All
   kinds of filters uses BitSet?
   Moreover maxSize of filterCache is set to 16000 in my case. There are
   evictions during day traffic
   but not during night traffic.
  
   Version of Solr which I use is 3.5
  
   I haven't used Memory Anayzer yet. Could you write more details about
 it?
  
   --
   Regards,
   Pawel
  
   On Wed, Jun 13, 2012 at 10:55 PM, Erick Erickson 
  erickerick...@gmail.comwrote:
  
   Hmmm, I think you may be looking at the wrong thing here. Generally,
 a
   filterCache
   entry will be maxDocs/8 (plus some overhead), so in your case they
  really
   shouldn't be all that large, on the order of 3M/filter. That
 shouldn't
   vary based
   on the number of docs that match the fq, it's just a bitset. To see
 if
   that makes any
   sense, take a look at the admin page and the number of evictions in
   your filterCache. If
   that is  0, you're probably using all the memory you're going to in
   the filterCache during
   the day..
  
   But you haven't indicated what version of Solr you're using, I'm
 going
   from a
   relatively recent 3x knowledge-base.
  
   Have you put a memory analyzer against your Solr instance to see
 where
   the memory
   is being used?
  
   Best
   Erick
  
   On Wed, Jun 13, 2012 at 1:05 PM, Pawel pawelmis...@gmail.com
 wrote:
Hi,
I have solr index with about 25M documents. I optimized FilterCache
  size
   to
reach the best performance (considering traffic characteristic
 that my
   Solr
handles). I see that the only way to limit size of a Filter Cace
 is to
   set
number of document sets that Solr can cache. There is no way to set
   memory
limit (eg. 2GB, 4GB or something like that). When I process a
 standard
trafiic (during day) everything is fine. But when Solr handle night
   traffic
(and the charateristic of requests change) some problems appear.
  There is
JVM out of memory error. I know what is the reason. Some filters on
  some
fields are quite poor filters. They returns 15M of documents or
 even
   more.
You could say 'Just put that into q'. I tried to put that filters
 into
Query part but then, the statistics of request processing time
  (during
day) become much worse. Reduction of Filter Cache maxSize

Re: FilterCache - maximum size of document set

2012-06-14 Thread Pawel Rog
It may be true that the filter cache maxSize is set to too high a value.
We looked at evictions and hit rate earlier. Maybe you are right that
evictions are not always unwanted. Some time ago we made tests. There is not
such a big difference in hit rate between a filterCache maxSize of 4000 (hit
rate about 85%) and 16000 (hit rate about 91%). I think that using an LFU
cache could also be helpful, but it requires me to migrate to 3.6. Do you
think it is reasonable to use a slave on version 3.6 and a master on 3.5?

Once again, Thanks for your help

--
Pawel

On Thu, Jun 14, 2012 at 7:22 PM, Erick Erickson erickerick...@gmail.comwrote:

 Hmmm, your maxSize is pretty high, it may just be that you've set this
 much higher
 than is wise. The maxSize setting governs the number of entries. I'd start
 with
 a much lower number here, and monitor the solr/admin page for both
 hit ratio and evictions. Well, and size too. 16,000 entries puts a
 ceiling of, what,
 48G on it? Ouch! It sounds like what's happening here is you're just
 accumulating
 more and more fqs over the course of the evening and blowing memory.

 Not all FQs will be that big, there's some heuristics in there to just
 store the
 document numbers for sparse filters, maxDocs/8 is pretty much the upper
 bound though.

 Evictions are not necessarily a bad thing, the hit-ratio is important
 here. And
 if you're using a bare NOW in your filter queries, you're probably never
 re-using them anyway, see:

 http://www.lucidimagination.com/blog/2012/02/23/date-math-now-and-filter-queries/

 I really question whether this limit is reasonable, but you know your
 situation best.

 Best
 Erick

 On Wed, Jun 13, 2012 at 5:40 PM, Pawel Rog pawelro...@gmail.com wrote:
  Thanks for your response
  Yes, maybe you are right. I thought that filters can be larger than 3M.
 All
  kinds of filters uses BitSet?
  Moreover maxSize of filterCache is set to 16000 in my case. There are
  evictions during day traffic
  but not during night traffic.
 
  Version of Solr which I use is 3.5
 
  I haven't used Memory Anayzer yet. Could you write more details about it?
 
  --
  Regards,
  Pawel
 
  On Wed, Jun 13, 2012 at 10:55 PM, Erick Erickson 
 erickerick...@gmail.comwrote:
 
  Hmmm, I think you may be looking at the wrong thing here. Generally, a
  filterCache
  entry will be maxDocs/8 (plus some overhead), so in your case they
 really
  shouldn't be all that large, on the order of 3M/filter. That shouldn't
  vary based
  on the number of docs that match the fq, it's just a bitset. To see if
  that makes any
  sense, take a look at the admin page and the number of evictions in
  your filterCache. If
  that is  0, you're probably using all the memory you're going to in
  the filterCache during
  the day..
 
  But you haven't indicated what version of Solr you're using, I'm going
  from a
  relatively recent 3x knowledge-base.
 
  Have you put a memory analyzer against your Solr instance to see where
  the memory
  is being used?
 
  Best
  Erick
 
  On Wed, Jun 13, 2012 at 1:05 PM, Pawel pawelmis...@gmail.com wrote:
   Hi,
   I have solr index with about 25M documents. I optimized FilterCache
 size
  to
   reach the best performance (considering traffic characteristic that my
  Solr
   handles). I see that the only way to limit size of a Filter Cace is to
  set
   number of document sets that Solr can cache. There is no way to set
  memory
   limit (eg. 2GB, 4GB or something like that). When I process a standard
   trafiic (during day) everything is fine. But when Solr handle night
  traffic
   (and the charateristic of requests change) some problems appear.
 There is
   JVM out of memory error. I know what is the reason. Some filters on
 some
   fields are quite poor filters. They returns 15M of documents or even
  more.
   You could say 'Just put that into q'. I tried to put that filters into
   Query part but then, the statistics of request processing time
 (during
   day) become much worse. Reduction of Filter Cache maxSize is also not
  good
   solution because during day cache filters are very very helpful.
   You could be interested in type of filters that I use. These are range
   filters (I tried standard range filters and frange) - eg. price:[* TO
   1]. Some fq with price can return few thousands of results (eg.
   price:[40 TO 50]), but some (eg. price:[* TO 1]) can return
 milions
  of
   documents. I'd also like to avoid solution which will introduce strict
   ranges that user can choose.
   Have you any suggestions what can I do? Is there any way to limit for
   example maximum size of docSet which is cached in FilterCache?
  
   --
   Pawel
 



Re: FilterCache - maximum size of document set

2012-06-13 Thread Pawel Rog
Thanks for your response.
Yes, maybe you are right. I thought that filters could be larger than 3M. Do
all kinds of filters use a BitSet?
Moreover, maxSize of the filterCache is set to 16000 in my case. There are
evictions during day traffic but not during night traffic.

The version of Solr which I use is 3.5.

I haven't used a memory analyzer yet. Could you write more details about it?

--
Regards,
Pawel

On Wed, Jun 13, 2012 at 10:55 PM, Erick Erickson erickerick...@gmail.comwrote:

 Hmmm, I think you may be looking at the wrong thing here. Generally, a
 filterCache
 entry will be maxDocs/8 (plus some overhead), so in your case they really
 shouldn't be all that large, on the order of 3M/filter. That shouldn't
 vary based
 on the number of docs that match the fq, it's just a bitset. To see if
 that makes any
 sense, take a look at the admin page and the number of evictions in
 your filterCache. If
 that is  0, you're probably using all the memory you're going to in
 the filterCache during
 the day..

 But you haven't indicated what version of Solr you're using, I'm going
 from a
 relatively recent 3x knowledge-base.

 Have you put a memory analyzer against your Solr instance to see where
 the memory
 is being used?

 Best
 Erick

 On Wed, Jun 13, 2012 at 1:05 PM, Pawel pawelmis...@gmail.com wrote:
  Hi,
  I have solr index with about 25M documents. I optimized FilterCache size
 to
  reach the best performance (considering traffic characteristic that my
 Solr
  handles). I see that the only way to limit size of a Filter Cace is to
 set
  number of document sets that Solr can cache. There is no way to set
 memory
  limit (eg. 2GB, 4GB or something like that). When I process a standard
  trafiic (during day) everything is fine. But when Solr handle night
 traffic
  (and the charateristic of requests change) some problems appear. There is
  JVM out of memory error. I know what is the reason. Some filters on some
  fields are quite poor filters. They returns 15M of documents or even
 more.
  You could say 'Just put that into q'. I tried to put that filters into
  Query part but then, the statistics of request processing time (during
  day) become much worse. Reduction of Filter Cache maxSize is also not
 good
  solution because during day cache filters are very very helpful.
  You could be interested in type of filters that I use. These are range
  filters (I tried standard range filters and frange) - eg. price:[* TO
  1]. Some fq with price can return few thousands of results (eg.
  price:[40 TO 50]), but some (eg. price:[* TO 1]) can return milions
 of
  documents. I'd also like to avoid solution which will introduce strict
  ranges that user can choose.
  Have you any suggestions what can I do? Is there any way to limit for
  example maximum size of docSet which is cached in FilterCache?
 
  --
  Pawel
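
A quick back-of-the-envelope check of the filterCache ceiling estimated
earlier in this thread, using the numbers discussed here (25M documents,
maxSize=16000, roughly maxDocs/8 bytes per cached set):

# one cached set is roughly maxDocs/8 bytes, i.e. about 3 MB for 25M docs:
echo "25000000 / 8 * 16000 / 1024 / 1024 / 1024" | bc
# prints 46 -- roughly the 46-48 GB worst-case footprint mentioned above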



Re: Difference between two solr indexes

2012-04-17 Thread Pawel Rog
If there are only 100'000 documents, dump all document ids and make a diff.
If you're using a Linux-based system you can use simple tools to do it.
Something like this can be helpful:

# dump all ids from both cores (rows has to cover the whole index)
curl "http://your.hostA:port/solr/index/select?q=*:*&fl=id&wt=csv&rows=100000" > /tmp/idsA
curl "http://your.hostB:port/solr/index/select?q=*:*&fl=id&wt=csv&rows=100000" > /tmp/idsB
# ids present only in A (flip '<' to '>' for the other direction), wrapped in <id> tags
diff /tmp/idsA /tmp/idsB | grep '^<' | awk '{print $2}' | sed 's/\(.*\)/<id>\1<\/id>/g' > /tmp/ids_to_delete.xml

Now you have the file. Just wrap its contents in <delete> and </delete> and
upload it to Solr using curl:

curl -X POST -H "Content-Type: text/xml" -d @/tmp/ids_to_delete.xml "http://your.hostA:port/solr/index/update"

On Tue, Apr 17, 2012 at 2:09 PM, nutchsolruser nutchsolru...@gmail.comwrote:

 I'm Also seeking solution for similar problem.

 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/Difference-between-two-solr-indexes-tp3916328p3917050.html
 Sent from the Solr - User mailing list archive at Nabble.com.



Re: solr hangs

2012-04-11 Thread Pawel Rog
You wrote that you can see an OutOfMemoryError. I had such problems when my
caches were too big. It means that there is no more free memory in the JVM
and probably a full GC starts running. How big is your Java heap? Maybe the
cache sizes in your Solr are too big for your JVM settings.

--
Regards,
Pawel
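
Two quick checks that help tell heap pressure apart from other limits; the
pid is a placeholder, and the second check is an extra suggestion rather than
something from this thread:

# Watch GC activity and old-gen occupancy of the Solr JVM (replace <pid>):
jstat -gcutil <pid> 5000
# This particular OOME ("unable to create new native thread") can also mean
# the OS per-user process/thread limit was hit rather than the heap:
ulimit -u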

On Tue, Apr 10, 2012 at 9:51 PM, Peter Markey sudoma...@gmail.com wrote:

 Hello,

 I have a solr cloud setup based on a blog (
 http://outerthought.org/blog/491-ot.html) and am able to bring up the
 instances and cores. But when I start indexing data (through csv update),
 the core throws a out of memory exception (null:java.lang.RuntimeException:
 java.lang.OutOfMemoryError: unable to create new native thread). The thread
 dump from new solr ui is below:

 cmdDistribExecutor-8-thread-777 (827)
 java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@1bd11b79
   - sun.misc.Unsafe.park(Native Method)
   - java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
   - java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
   - org.apache.http.impl.conn.tsccm.WaitingThread.await(WaitingThread.java:158)
   - org.apache.http.impl.conn.tsccm.ConnPoolByRoute.getEntryBlocking(ConnPoolByRoute.java:403)
   - org.apache.http.impl.conn.tsccm.ConnPoolByRoute$1.getPoolEntry(ConnPoolByRoute.java:300)
   - org.apache.http.impl.conn.tsccm.ThreadSafeClientConnManager$1.getConnection(ThreadSafeClientConnManager.java:224)
   - org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:401)
   - org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:820)
   - org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:754)
   - org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:732)
   - org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:304)
   - org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:209)
   - org.apache.solr.update.SolrCmdDistributor$1.call(SolrCmdDistributor.java:320)
   - org.apache.solr.update.SolrCmdDistributor$1.call(SolrCmdDistributor.java:301)
   - java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
   - java.util.concurrent.FutureTask.run(FutureTask.java:166)
   - java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
   - java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
   - java.util.concurrent.FutureTask.run(FutureTask.java:166)
   - java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
   - java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
   - java.lang.Thread.run(Thread.java:679)



 Apparently I do see lots of threads like above in the thread dump. I'm
 using latest build from the trunk (Apr 10th). Any insights into this issue
 woudl be really helpful. Thanks a lot.



Re: Usage of * as a first character in wild card query

2012-03-25 Thread Pawel Rog
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.ReversedWildcardFilterFactory

On Mon, Mar 26, 2012 at 7:08 AM, Ishan isan.fu...@germinait.com wrote:

 Hi,

 I need to query Solr with * as the first character of the query.
 For eg. the content indexed is "Be careful"
 and the query I want to fire is "*ful".
 But Solr does not allow * as the first character in a wildcard query.
 Please let me know if there is any other alternative for doing this.
 --
 Thanks & Regards,
 Isan Fulia.



Re: Boosting terms

2012-03-19 Thread Pawel Rog
Thanks a lot, I'll read it :) It seems to be helpful.

On Sun, Mar 18, 2012 at 8:58 PM, Ahmet Arslan iori...@yahoo.com wrote:

 Is there any possibility to boost
 terms during indexing? Searching
 that using google I found information that there is no such
 feature in
 Solr (we can only boost fields). Is it true?

 Yes, only field and document boosting exist.

 You might find this article interesting.

 http://www.lucidimagination.com/blog/2009/08/05/getting-started-with-payloads/




Re: Help with duplicate unique IDs

2012-03-02 Thread Pawel Rog
Once I had the same problem. I didn't know what was going on. After a few
moments of analysis I created a completely new index and removed the old one
(I didn't have enough time to analyze the problem). The problem didn't come
back any more.

--
Regards,
Pawel
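
One way to see how widespread the duplication is before rebuilding: facet on
the uniqueKey field and keep only values that occur more than once. A sketch
(host and core are placeholders, and this can be heavy on a large index):

# Every facet bucket with a count >= 2 is a duplicated uniqueKey value.
curl "http://localhost:8983/solr/select?q=*:*&rows=0&facet=true&facet.field=id&facet.mincount=2&facet.limit=-1&wt=json&indent=true"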

On Fri, Mar 2, 2012 at 8:23 PM, Thomas Dowling tdowl...@ohiolink.edu wrote:
 In a Solr index of journal articles, I thought I was safe reindexing
 articles because their unique ID would cause the new record in the index to
 overwrite the old one. (As stated at
 http://wiki.apache.org/solr/SchemaXml#The_Unique_Key_Field - right?)

 My schema.xml includes:

 <fields>...
 <field name="id" type="string" indexed="true" stored="true"
  required="true"/>
 ...</fields>

 And:

 <uniqueKey>id</uniqueKey>

 And yet I can compose a query with two hits in the index, showing:

 #1: <str name="id">03405443/v66i0003/347_mrirtaitmbpa</str>
 #2: <str name="id">03405443/v66i0003/347_mrirtaitmbpa</str>


 Can anyone give pointers on where I'm screwing something up?


 Thomas Dowling
 thomas.dowl...@gmail.com


Re: Realtime profile data

2012-02-07 Thread Pawel Rog
Thank you. I'll try NRT and some post-filter :)


On Tue, Feb 7, 2012 at 3:09 PM, Erick Erickson erickerick...@gmail.com wrote:
 You have several options:
 1 if you can go to trunk (bleeding edge, I admit), you can
     get into the near real time (NRT) stuff.
 2 You could maintain essentially a post-filter step where
      your app maintains a list of deleted messages and
     removes them from the response. This will cause
     some of your counts (e.g. facets, grouping) to be slightly
     off
 3 Train your users to expect whatever latency you've
      built into the system (i.e. indexing, commit and replication)

 Best
 Erick

 On Mon, Feb 6, 2012 at 10:42 AM, Pawel Rog pawelro...@gmail.com wrote:
 Hello. I have some problem which i'd like to solve using solr. I have
 user profile which has some kind of messages in it. User can filter
 messages, sort them etc. The problem is with delete operation. If user
 click on message to delete it it's very hard to update index of solr
 in real time. When user deletes message, it will be still visible.
 Have you idea how to solve problem with removing data?
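
For what it's worth, the delete itself is a small update request; how quickly
it becomes visible then depends entirely on the commit strategy, which is
exactly the latency trade-off described above. A sketch with a placeholder id
and host:

# Delete one message by id (12345 is a placeholder); without NRT / soft
# commits the change only becomes visible after the next commit.
curl "http://localhost:8983/solr/update" -H "Content-Type: text/xml" \
  --data-binary "<delete><id>12345</id></delete>"
curl "http://localhost:8983/solr/update" -H "Content-Type: text/xml" \
  --data-binary "<commit/>"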


Re: Solr 3.5 very slow (performance)

2011-11-30 Thread Pawel Rog
* 1st question (ls from index directory)

solr 1.4

-rw-r--r-- 1 user user2180582 Nov 30 07:26 _3g1_cf.del
-rw-r--r-- 1 user user 5190652802 Nov 28 17:57 _3g1.fdt
-rw-r--r-- 1 user user  139556724 Nov 28 17:57 _3g1.fdx
-rw-r--r-- 1 user user   4963 Nov 28 17:56 _3g1.fnm
-rw-r--r-- 1 user user 1879006175 Nov 28 18:01 _3g1.frq
-rw-r--r-- 1 user user  513919573 Nov 28 18:01 _3g1.prx
-rw-r--r-- 1 user user2745451 Nov 28 18:01 _3g1.tii
-rw-r--r-- 1 user user  218731810 Nov 28 18:01 _3g1.tis
-rw-r--r-- 1 user user 275268 Nov 30 07:26 _3uu_1a.del
-rw-r--r-- 1 user user  666375513 Nov 30 03:35 _3uu.fdt
-rw-r--r-- 1 user user   17616636 Nov 30 03:35 _3uu.fdx
-rw-r--r-- 1 user user   4884 Nov 30 03:35 _3uu.fnm
-rw-r--r-- 1 user user  243847897 Nov 30 03:35 _3uu.frq
-rw-r--r-- 1 user user   64791316 Nov 30 03:35 _3uu.prx
-rw-r--r-- 1 user user 545317 Nov 30 03:35 _3uu.tii
-rw-r--r-- 1 user user   42993472 Nov 30 03:35 _3uu.tis
-rw-r--r-- 1 user user   1178 Nov 30 07:26 _3wj_1.del
-rw-r--r-- 1 user user2813124 Nov 30 07:26 _3wj.fdt
-rw-r--r-- 1 user user  74852 Nov 30 07:26 _3wj.fdx
-rw-r--r-- 1 user user   2175 Nov 30 07:26 _3wj.fnm
-rw-r--r-- 1 user user 911051 Nov 30 07:26 _3wj.frq
-rw-r--r-- 1 user user  4 Nov 30 07:26 _3wj.nrm
-rw-r--r-- 1 user user 285405 Nov 30 07:26 _3wj.prx
-rw-r--r-- 1 user user   7951 Nov 30 07:26 _3wj.tii
-rw-r--r-- 1 user user 624702 Nov 30 07:26 _3wj.tis
-rw-r--r-- 1 user user   35859092 Nov 30 07:26 _3wk.fdt
-rw-r--r-- 1 user user 958148 Nov 30 07:26 _3wk.fdx
-rw-r--r-- 1 user user   4104 Nov 30 07:26 _3wk.fnm
-rw-r--r-- 1 user user   12228212 Nov 30 07:26 _3wk.frq
-rw-r--r-- 1 user user3438508 Nov 30 07:26 _3wk.prx
-rw-r--r-- 1 user user  58672 Nov 30 07:26 _3wk.tii
-rw-r--r-- 1 user user4621519 Nov 30 07:26 _3wk.tis
-rw-r--r-- 1 user user  0 Nov 30 07:27
lucene-9445a367a714cc9bf70d0ebdf83b9e01-write.lock
-rw-r--r-- 1 user user   1010 Nov 30 07:26 segments_2tr
-rw-r--r-- 1 user user 20 Nov 17 14:06 segments.gen

solr 3.5 (dates are older - because I turned off feeding 3.5 instance)

-rw-r--r-- 1 user user2188376 Nov 29 13:10 _2x_6g.del
-rw-r--r-- 1 user user 4955406209 Nov 28 17:38 _2x.fdt
-rw-r--r-- 1 user user  140054140 Nov 28 17:38 _2x.fdx
-rw-r--r-- 1 user user   4852 Nov 28 17:37 _2x.fnm
-rw-r--r-- 1 user user 1845719205 Nov 28 17:42 _2x.frq
-rw-r--r-- 1 user user  497871055 Nov 28 17:42 _2x.prx
-rw-r--r-- 1 user user3006635 Nov 28 17:42 _2x.tii
-rw-r--r-- 1 user user  230304265 Nov 28 17:42 _2x.tis
-rw-r--r-- 1 user user  50128 Nov 29 13:10 _5s_48.del
-rw-r--r-- 1 user user  116159640 Nov 29 00:25 _5s.fdt
-rw-r--r-- 1 user user3206268 Nov 29 00:25 _5s.fdx
-rw-r--r-- 1 user user   4963 Nov 29 00:25 _5s.fnm
-rw-r--r-- 1 user user   44556139 Nov 29 00:25 _5s.frq
-rw-r--r-- 1 user user   11405232 Nov 29 00:25 _5s.prx
-rw-r--r-- 1 user user 149965 Nov 29 00:25 _5s.tii
-rw-r--r-- 1 user user   11662163 Nov 29 00:25 _5s.tis
-rw-r--r-- 1 user user  63191 Nov 29 13:10 _97_1o.del
-rw-r--r-- 1 user user  145482785 Nov 29 08:08 _97.fdt
-rw-r--r-- 1 user user4042300 Nov 29 08:08 _97.fdx
-rw-r--r-- 1 user user   4963 Nov 29 08:08 _97.fnm
-rw-r--r-- 1 user user   55361299 Nov 29 08:08 _97.frq
-rw-r--r-- 1 user user   14181208 Nov 29 08:08 _97.prx
-rw-r--r-- 1 user user 187731 Nov 29 08:08 _97.tii
-rw-r--r-- 1 user user   14617940 Nov 29 08:08 _97.tis
-rw-r--r-- 1 user user  21310 Nov 29 13:10 _9q_1a.del
-rw-r--r-- 1 user user   49864395 Nov 29 09:19 _9q.fdt
-rw-r--r-- 1 user user1361884 Nov 29 09:19 _9q.fdx
-rw-r--r-- 1 user user   4963 Nov 29 09:19 _9q.fnm
-rw-r--r-- 1 user user   17879364 Nov 29 09:19 _9q.frq
-rw-r--r-- 1 user user4970178 Nov 29 09:19 _9q.prx
-rw-r--r-- 1 user user  75969 Nov 29 09:19 _9q.tii
-rw-r--r-- 1 user user5932085 Nov 29 09:19 _9q.tis
-rw-r--r-- 1 user user   62661357 Nov 29 10:19 _a6.fdt
-rw-r--r-- 1 user user1717820 Nov 29 10:19 _a6.fdx
-rw-r--r-- 1 user user   4963 Nov 29 10:19 _a6.fnm
-rw-r--r-- 1 user user   23283028 Nov 29 10:19 _a6.frq
-rw-r--r-- 1 user user6196945 Nov 29 10:19 _a6.prx
-rw-r--r-- 1 user user  92528 Nov 29 10:19 _a6.tii
-rw-r--r-- 1 user user7209783 Nov 29 10:19 _a6.tis
-rw-r--r-- 1 user user  26871 Nov 29 13:10 _a6_y.del
-rw-r--r-- 1 user user   16372020 Nov 29 10:39 _ab.fdt
-rw-r--r-- 1 user user 455476 Nov 29 10:39 _ab.fdx
-rw-r--r-- 1 user user   4963 Nov 29 10:39 _ab.fnm
-rw-r--r-- 1 user user6025966 Nov 29 10:39 _ab.frq
-rw-r--r-- 1 user user1622841 Nov 29 10:39 _ab.prx
-rw-r--r-- 1 user user  35252 Nov 29 10:39 _ab.tii
-rw-r--r-- 1 user user2766468 Nov 29 10:39 _ab.tis
-rw-r--r-- 1 user user   7147 Nov 29 13:10 _ab_u.del
-rw-r--r-- 1 user user   14818116 Nov 29 11:09 _aj.fdt
-rw-r--r-- 1 user user 409356 Nov 29 11:09 _aj.fdx
-rw-r--r-- 1 user user   4963 Nov 29 11:09 _aj.fnm
-rw-r--r-- 1 user user5461353 

Re: Solr 3.5 very slow (performance)

2011-11-30 Thread Pawel Rog
I attached a chart which presents CPU usage. Solr 3.5 uses almost all CPU
(left side of the chart).
At the beginning of the chart there were about 60 rps and then about 100 rps
(before turning off Solr 3.5). Then Solr 1.4 was turned on with
100 rps.

--
Pawel

On Wed, Nov 30, 2011 at 9:07 AM, Pawel Rog pawelro...@gmail.com wrote:
 * 1st question (ls from index directory)

 solr 1.4

 -rw-r--r-- 1 user user    2180582 Nov 30 07:26 _3g1_cf.del
 -rw-r--r-- 1 user user 5190652802 Nov 28 17:57 _3g1.fdt
 -rw-r--r-- 1 user user  139556724 Nov 28 17:57 _3g1.fdx
 -rw-r--r-- 1 user user       4963 Nov 28 17:56 _3g1.fnm
 -rw-r--r-- 1 user user 1879006175 Nov 28 18:01 _3g1.frq
 -rw-r--r-- 1 user user  513919573 Nov 28 18:01 _3g1.prx
 -rw-r--r-- 1 user user    2745451 Nov 28 18:01 _3g1.tii
 -rw-r--r-- 1 user user  218731810 Nov 28 18:01 _3g1.tis
 -rw-r--r-- 1 user user     275268 Nov 30 07:26 _3uu_1a.del
 -rw-r--r-- 1 user user  666375513 Nov 30 03:35 _3uu.fdt
 -rw-r--r-- 1 user user   17616636 Nov 30 03:35 _3uu.fdx
 -rw-r--r-- 1 user user       4884 Nov 30 03:35 _3uu.fnm
 -rw-r--r-- 1 user user  243847897 Nov 30 03:35 _3uu.frq
 -rw-r--r-- 1 user user   64791316 Nov 30 03:35 _3uu.prx
 -rw-r--r-- 1 user user     545317 Nov 30 03:35 _3uu.tii
 -rw-r--r-- 1 user user   42993472 Nov 30 03:35 _3uu.tis
 -rw-r--r-- 1 user user       1178 Nov 30 07:26 _3wj_1.del
 -rw-r--r-- 1 user user    2813124 Nov 30 07:26 _3wj.fdt
 -rw-r--r-- 1 user user      74852 Nov 30 07:26 _3wj.fdx
 -rw-r--r-- 1 user user       2175 Nov 30 07:26 _3wj.fnm
 -rw-r--r-- 1 user user     911051 Nov 30 07:26 _3wj.frq
 -rw-r--r-- 1 user user          4 Nov 30 07:26 _3wj.nrm
 -rw-r--r-- 1 user user     285405 Nov 30 07:26 _3wj.prx
 -rw-r--r-- 1 user user       7951 Nov 30 07:26 _3wj.tii
 -rw-r--r-- 1 user user     624702 Nov 30 07:26 _3wj.tis
 -rw-r--r-- 1 user user   35859092 Nov 30 07:26 _3wk.fdt
 -rw-r--r-- 1 user user     958148 Nov 30 07:26 _3wk.fdx
 -rw-r--r-- 1 user user       4104 Nov 30 07:26 _3wk.fnm
 -rw-r--r-- 1 user user   12228212 Nov 30 07:26 _3wk.frq
 -rw-r--r-- 1 user user    3438508 Nov 30 07:26 _3wk.prx
 -rw-r--r-- 1 user user      58672 Nov 30 07:26 _3wk.tii
 -rw-r--r-- 1 user user    4621519 Nov 30 07:26 _3wk.tis
 -rw-r--r-- 1 user user          0 Nov 30 07:27
 lucene-9445a367a714cc9bf70d0ebdf83b9e01-write.lock
 -rw-r--r-- 1 user user       1010 Nov 30 07:26 segments_2tr
 -rw-r--r-- 1 user user         20 Nov 17 14:06 segments.gen

 solr 3.5 (dates are older - because I turned off feeding 3.5 instance)

 -rw-r--r-- 1 user user    2188376 Nov 29 13:10 _2x_6g.del
 -rw-r--r-- 1 user user 4955406209 Nov 28 17:38 _2x.fdt
 -rw-r--r-- 1 user user  140054140 Nov 28 17:38 _2x.fdx
 -rw-r--r-- 1 user user       4852 Nov 28 17:37 _2x.fnm
 -rw-r--r-- 1 user user 1845719205 Nov 28 17:42 _2x.frq
 -rw-r--r-- 1 user user  497871055 Nov 28 17:42 _2x.prx
 -rw-r--r-- 1 user user    3006635 Nov 28 17:42 _2x.tii
 -rw-r--r-- 1 user user  230304265 Nov 28 17:42 _2x.tis
 -rw-r--r-- 1 user user      50128 Nov 29 13:10 _5s_48.del
 -rw-r--r-- 1 user user  116159640 Nov 29 00:25 _5s.fdt
 -rw-r--r-- 1 user user    3206268 Nov 29 00:25 _5s.fdx
 -rw-r--r-- 1 user user       4963 Nov 29 00:25 _5s.fnm
 -rw-r--r-- 1 user user   44556139 Nov 29 00:25 _5s.frq
 -rw-r--r-- 1 user user   11405232 Nov 29 00:25 _5s.prx
 -rw-r--r-- 1 user user     149965 Nov 29 00:25 _5s.tii
 -rw-r--r-- 1 user user   11662163 Nov 29 00:25 _5s.tis
 -rw-r--r-- 1 user user      63191 Nov 29 13:10 _97_1o.del
 -rw-r--r-- 1 user user  145482785 Nov 29 08:08 _97.fdt
 -rw-r--r-- 1 user user    4042300 Nov 29 08:08 _97.fdx
 -rw-r--r-- 1 user user       4963 Nov 29 08:08 _97.fnm
 -rw-r--r-- 1 user user   55361299 Nov 29 08:08 _97.frq
 -rw-r--r-- 1 user user   14181208 Nov 29 08:08 _97.prx
 -rw-r--r-- 1 user user     187731 Nov 29 08:08 _97.tii
 -rw-r--r-- 1 user user   14617940 Nov 29 08:08 _97.tis
 -rw-r--r-- 1 user user      21310 Nov 29 13:10 _9q_1a.del
 -rw-r--r-- 1 user user   49864395 Nov 29 09:19 _9q.fdt
 -rw-r--r-- 1 user user    1361884 Nov 29 09:19 _9q.fdx
 -rw-r--r-- 1 user user       4963 Nov 29 09:19 _9q.fnm
 -rw-r--r-- 1 user user   17879364 Nov 29 09:19 _9q.frq
 -rw-r--r-- 1 user user    4970178 Nov 29 09:19 _9q.prx
 -rw-r--r-- 1 user user      75969 Nov 29 09:19 _9q.tii
 -rw-r--r-- 1 user user    5932085 Nov 29 09:19 _9q.tis
 -rw-r--r-- 1 user user   62661357 Nov 29 10:19 _a6.fdt
 -rw-r--r-- 1 user user    1717820 Nov 29 10:19 _a6.fdx
 -rw-r--r-- 1 user user       4963 Nov 29 10:19 _a6.fnm
 -rw-r--r-- 1 user user   23283028 Nov 29 10:19 _a6.frq
 -rw-r--r-- 1 user user    6196945 Nov 29 10:19 _a6.prx
 -rw-r--r-- 1 user user      92528 Nov 29 10:19 _a6.tii
 -rw-r--r-- 1 user user    7209783 Nov 29 10:19 _a6.tis
 -rw-r--r-- 1 user user      26871 Nov 29 13:10 _a6_y.del
 -rw-r--r-- 1 user user   16372020 Nov 29 10:39 _ab.fdt
 -rw-r--r-- 1 user user     455476 Nov 29 10:39 _ab.fdx
 -rw-r--r-- 1 user user       4963 Nov 29 10:39 _ab.fnm
 -rw-r--r-- 1 user user    6025966 Nov 29 10:39 _ab.frq
 -rw-r--r-- 1 user user

Re: Solr 3.5 very slow (performance)

2011-11-30 Thread Pawel Rog
I made a thread dump. Most active threads have a trace like this:

471003383@qtp-536357250-245 - Thread t@270
   java.lang.Thread.State: RUNNABLE
    at org.apache.solr.search.SolrIndexSearcher.getDocSet(SolrIndexSearcher.java:702)
    at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1144)
    at org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:362)
    at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:378)
    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:194)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1372)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:356)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:252)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
    at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
    at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
    at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
    at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
    at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
    at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
    at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
    at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
    at org.mortbay.jetty.Server.handle(Server.java:326)
    at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
    at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
    at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
    at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
    at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
    at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228)
    at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)


On Wed, Nov 30, 2011 at 10:31 AM, Pawel Rog pawelro...@gmail.com wrote:
 I attach chart which presents cpu usage. Solr 3.5 uses almost all cpu
 (left side of chart).
 at the begining of chart there was about 60rps and about 100rps
 (before turning off solr 3.5). Then there was 1.4 turned on with
 100rps.

 --
 Pawel

 On Wed, Nov 30, 2011 at 9:07 AM, Pawel Rog pawelro...@gmail.com wrote:
 * 1st question (ls from index directory)

 solr 1.4

 -rw-r--r-- 1 user user    2180582 Nov 30 07:26 _3g1_cf.del
 -rw-r--r-- 1 user user 5190652802 Nov 28 17:57 _3g1.fdt
 -rw-r--r-- 1 user user  139556724 Nov 28 17:57 _3g1.fdx
 -rw-r--r-- 1 user user       4963 Nov 28 17:56 _3g1.fnm
 -rw-r--r-- 1 user user 1879006175 Nov 28 18:01 _3g1.frq
 -rw-r--r-- 1 user user  513919573 Nov 28 18:01 _3g1.prx
 -rw-r--r-- 1 user user    2745451 Nov 28 18:01 _3g1.tii
 -rw-r--r-- 1 user user  218731810 Nov 28 18:01 _3g1.tis
 -rw-r--r-- 1 user user     275268 Nov 30 07:26 _3uu_1a.del
 -rw-r--r-- 1 user user  666375513 Nov 30 03:35 _3uu.fdt
 -rw-r--r-- 1 user user   17616636 Nov 30 03:35 _3uu.fdx
 -rw-r--r-- 1 user user       4884 Nov 30 03:35 _3uu.fnm
 -rw-r--r-- 1 user user  243847897 Nov 30 03:35 _3uu.frq
 -rw-r--r-- 1 user user   64791316 Nov 30 03:35 _3uu.prx
 -rw-r--r-- 1 user user     545317 Nov 30 03:35 _3uu.tii
 -rw-r--r-- 1 user user   42993472 Nov 30 03:35 _3uu.tis
 -rw-r--r-- 1 user user       1178 Nov 30 07:26 _3wj_1.del
 -rw-r--r-- 1 user user    2813124 Nov 30 07:26 _3wj.fdt
 -rw-r--r-- 1 user user      74852 Nov 30 07:26 _3wj.fdx
 -rw-r--r-- 1 user user       2175 Nov 30 07:26 _3wj.fnm
 -rw-r--r-- 1 user user     911051 Nov 30 07:26 _3wj.frq
 -rw-r--r-- 1 user user          4 Nov 30 07:26 _3wj.nrm
 -rw-r--r-- 1 user user     285405 Nov 30 07:26 _3wj.prx
 -rw-r--r-- 1 user user       7951 Nov 30 07:26 _3wj.tii
 -rw-r--r-- 1 user user     624702 Nov 30 07:26 _3wj.tis
 -rw-r--r-- 1 user user   35859092 Nov 30 07:26 _3wk.fdt
 -rw-r--r-- 1 user user     958148 Nov 30 07:26 _3wk.fdx
 -rw-r--r-- 1 user user       4104 Nov 30 07:26 _3wk.fnm
 -rw-r--r-- 1 user user   12228212 Nov 30 07:26 _3wk.frq
 -rw-r--r-- 1 user user    3438508 Nov 30 07:26 _3wk.prx
 -rw-r--r-- 1 user user      58672 Nov 30 07:26 _3wk.tii
 -rw-r--r-- 1 user user    4621519 Nov 30 07:26 _3wk.tis
 -rw-r--r-- 1 user user          0 Nov 30 07:27
 lucene-9445a367a714cc9bf70d0ebdf83b9e01-write.lock
 -rw-r--r-- 1 user user       1010 Nov 30 07:26 segments_2tr
 -rw-r--r-- 1 user user         20 Nov 17 14:06 segments.gen

 solr 3.5 (dates are older - because I turned off feeding 3.5 instance)


Re: Solr 3.5 very slow (performance)

2011-11-30 Thread Pawel Rog
http://imageshack.us/photo/my-images/838/cpuusage.png/

On Wed, Nov 30, 2011 at 9:18 PM, Chris Hostetter
hossman_luc...@fucit.org wrote:

 : I attach chart which presents cpu usage. Solr 3.5 uses almost all cpu
 : (left side of chart).

 FWIW: The mailing list software filters out most attachments (there are
 some exceptions for certain text mime types)


 -Hoss


Re: Solr 3.5 very slow (performance)

2011-11-30 Thread Pawel Rog
On Wed, Nov 30, 2011 at 9:05 PM, Chris Hostetter
hossman_luc...@fucit.org wrote:

 : I tried to use index from 1.4 (load was the same as on index from 3.5)
 : but there was problem with synchronization with master (invalid
 : javabin format)
 : Then I built new index on 3.5 with luceneMatchVersion LUCENE_35

 why would you need to re-replicate from the master?

 You already have a copy of the Solr 1.4 index on the slave machine where
 you are doing testing correct? Just (make sure Solr 1.4 isn't running
 and) point Solr 3.5 at that solr home directory for the configs and data
 and time that.  (Just because Solr 3.5 can't replicate from Solr 1.4
 over HTTP doesn't mean it can't open indexes built by Solr 1.4)


I did it before sending the earlier e-mail. The effect was the same.

 It's important to understand if the discrepencies you are seeing have to
 do with *building* the index under Solr 3.5, or *searching* in Solr 3.5.

 : reader : 
 SolrIndexReader{this=8cca36c,r=ReadOnlyDirectoryReader@8cca36c,refCnt=1,segments=4}
 : readerDir : 
 org.apache.lucene.store.NIOFSDirectory@/data/solr_data/itemsfull/index
 :
 : solr 3.5
 : reader : 
 SolrIndexReader{this=3d01e178,r=ReadOnlyDirectoryReader@3d01e178,refCnt=1,segments=14}
 : readerDir : 
 org.apache.lucene.store.MMapDirectory@/data/solr_data_350/itemsfull/index
 : lockFactory=org.apache.lucene.store.NativeFSLockFactory@294ce5eb

 As mentioned, the difference in the number of segments may be contributing
 to the perf differences you are seeing, so optimizing both indexes (or
 doing a partial optimize of your 3.5 index down to 4 segments) for
 comparison would probably be worthwhile.  (and if that is the entirety of
 hte problem, then explicitly configuring a MergePolicy may help you in the
 long run)

 but independent of that I would like to suggest that you first try
 explicitly configuring Solr 3.5 to use NIOFSDirectory so it's consistent
 with what Solr 1.4 was doing (I'm told MMapDirectory should be faster, but
 maybe there's something about your setup that makes that not true) So it
 would be helpful to also try adding this to your 3.5 solrconfig.xml and
 testing ...

  <directoryFactory name="DirectoryFactory" class="solr.NIOFSDirectoryFactory"/>

 : I made some test with quiet heavy query (with frange). In both cases
 : (1.4 and 3.5) I used the same newSearcher queries and started solr
 : without any load.
 : Results of debug timing

 Ok, well ... honestly: giving us *one* example of the timing data for
 *one* query (w/o even telling us what the exact query was) ins't really
 anything we can use to help you ... the crux of the question was: was the
 slow performance you are seeing only under heavy load or was it also slow
 when you did manual testing?

 : When I send fewer than 60 rps I see that in comparsion to 1.4 median
 : response time is worse, avarage is worse but maximum time is better.
 : It doesn't change propotion of cpu usage (3.5 uses much more cpu).

 How much fewer then 60 rps ? ... I'm trying to understand if the
 problems you are seeing are solely happening under heavy concurrent
 load, or if you are seeing Solr 3.5 consistently respond much slower then
 Solr 1.4 even with a single client?

 Also: I may still be missunderstanding how you are generating load, and
 wether you are throttling the clients, but seeing higher CPU utilization
 in Solr 3.5 isn't neccessarily an indication of something going wrong --
 in some cases higher CPU% (particularly under heavy concurrent load on a
 multi-core machine) could just mean that Solr is now capable of utilizing
 more CPU to process parallel request, where as previous versions might have
 been hitting other bottle necks. -- but that doesn't explain the slower
 response times. that's what concerns me the most.

I don't think that 1200% CPU usage with the same traffic is better
than 200%. I think you are wrong :) Using Solr 1.4 I can reach 300 rps
before reaching 1200% CPU, but only 60 rps with Solr 3.5.


 FWIW: I'm still wondering what the stats from your caches wound up looking
 like on both Solr 1.4 and Solr 3.5...

 7) What do the cache stats look like on your Solr 3.5 instance after
 you've done some of this timing testing?  the output of...
 http://localhost:8983/solr/admin/mbeans?cat=CACHE&stats=true&wt=json&indent=true
 ...would be helpful. NOTE: you may need to add this to your
 solrconfig.xml
 for that URL to work...
  <requestHandler name="/admin/" class="solr.admin.AdminHandlers" />

 ...but i don't think /admin/mbeans exists in Solr 1.4, so you may just
 have to get the details from stats.jsp.


I forgot to write it earlier. The queryCache hit rate was about 0.03 (in
Solr 1.4 and 3.5). The filterCache hit rate was about 0.35 in both cases.
The document cache hit rate was about 0.55 in both cases.

Wasn't the thread trace helpful for diagnosing the problem? As I mentioned
before, almost all threads were at the same line of code in
SolrIndexSearcher.


Re: Solr 3.5 very slow (performance)

2011-11-30 Thread Pawel Rog
Yes, it works. Thanks a lot.
But I still don't understand why that option was efficient in Solr 1.4
but not in Solr 3.5.
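
For anyone hitting the same thing: the flag from Yonik's reply below lives in
solrconfig.xml, and checking/changing it is quick. A sketch with a placeholder
path:

# Locate the flag in the config ...
grep -n "useFilterForSortedQuery" /path/to/solr/conf/solrconfig.xml
# ... and set it to false (or remove it):
#   <useFilterForSortedQuery>false</useFilterForSortedQuery>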

On Wed, Nov 30, 2011 at 11:01 PM, Yonik Seeley
yo...@lucidimagination.com wrote:
 On Wed, Nov 30, 2011 at 7:08 AM, Pawel Rog pawelro...@gmail.com wrote:
        at 
 org.apache.solr.search.SolrIndexSearcher.getDocSet(SolrIndexSearcher.java:702)
        at 
 org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1144)
        at 
 org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:362)

 This is interesting, and suggests that you have
 useFilterForSortedQuery set in your solrconfig.xml
 Can you try removing it (or setting it to false)?

 -Yonik
 http://www.lucidimagination.com


Re: Solr 3.5 very slow (performance)

2011-11-29 Thread Pawel Rog
Examples:

facet=true&sort=promoted+desc,ending+asc,b_count+desc&facet.mincount=1&start=0&q=name:(kurtka+skóry+brazowe42)&facet.limit=500&facet.field=cat1&facet.field=cat2&wt=json&rows=50

facet=true&sort=promoted+desc,ending+asc,b_count+desc&facet.mincount=1&start=1350&q=name:naczepa&facet.limit=500&facet.field=cat1&facet.field=cat2&wt=json&rows=50

facet=true&sort=promoted+desc,ending+asc,b_count+desc&facet.mincount=1&start=0&q=it_name:(miłosz+giedroyc)&facet.limit=500&facet.field=cat1&facet.field=cat2&wt=json&rows=50

default operation: AND
promoted - int
ending - int
b_count - int
name - text
cat1 - int
cat2 - int

These are only a few examples; almost all queries are much slower. There were
about 60 searches per second on both the old and new version of Solr. Solr
1.4 reached 200% CPU utilization and Solr 3.5 reached 1200% CPU utilization
on the same machine.

On Tue, Nov 29, 2011 at 7:05 PM, Yonik Seeley
yo...@lucidimagination.com wrote:
 On Tue, Nov 29, 2011 at 12:25 PM, Pawel pawelmis...@gmail.com wrote:
 I've build index on solr 1.4 some time ago (about 18milions documents,
 about 8GB). I need new features from newer version of solr, so i
 decided to upgrade solr version from 1.4 to 3.5.

 * I created new solr master on new physical machine
 * then I created new index using the same schema as in earlier version
 * then I indexed some slave, and start sending the same requests as
 earlier but to newer version of solr (3.5, but the same situation is
 on solr 3.4).

 The CPU went from 200% to 1200% and load went from 3 to 15. Avarage
 QTime went from 15ms to 180ms and median went from 1ms to 150ms
 I didn't change any parameters in solrconfig and schema.

 What are the requests that look slower?

 -Yonik
 http://www.lucidimagination.com


Re: Solr 3.5 very slow (performance)

2011-11-29 Thread Pawel Rog
In my last post I meant:
default operation AND
promoted - int
ending - int
b_count - int
name - text
cat1 - int
cat2 - int

On Tue, Nov 29, 2011 at 7:54 PM, Pawel Rog pawelro...@gmail.com wrote:
 examples

 facet=truesort=promoted+desc,ending+asc,b_count+descfacet.mincount=1start=0q=name:(kurtka+skóry+brazowe42)facet.limit=500facet.field=cat1facet.field=cat2wt=jsonrows=50

 facet=truesort=promoted+desc,ending+asc,b_count+descfacet.mincount=1start=1350q=name:naczepafacet.limit=500facet.field=cat1facet.field=cat2wt=jsonrows=50

 facet=truesort=promoted+desc,ending+asc,b_count+descfacet.mincount=1start=0q=it_name:(miłosz+giedroyc)facet.limit=500facet.field=cat1facet.field=cat2wt=jsonrows=50

 default operation ANDpromoted - intending - intb_count - intname -
 textcat1 - intcat2 -int
 these are only few examples. almost all queries are much slower. there
 was about 60 searches per second on old and new version of solr. solr
 1.4 reached 200% cpu utilization and solr 3.5 reached 1200% cpu
 utilization on same machine

 On Tue, Nov 29, 2011 at 7:05 PM, Yonik Seeley
 yo...@lucidimagination.com wrote:
 On Tue, Nov 29, 2011 at 12:25 PM, Pawel pawelmis...@gmail.com wrote:
 I've build index on solr 1.4 some time ago (about 18milions documents,
 about 8GB). I need new features from newer version of solr, so i
 decided to upgrade solr version from 1.4 to 3.5.

 * I created new solr master on new physical machine
 * then I created new index using the same schema as in earlier version
 * then I indexed some slave, and start sending the same requests as
 earlier but to newer version of solr (3.5, but the same situation is
 on solr 3.4).

 The CPU went from 200% to 1200% and load went from 3 to 15. Avarage
 QTime went from 15ms to 180ms and median went from 1ms to 150ms
 I didn't change any parameters in solrconfig and schema.

 What are the requests that look slower?

 -Yonik
 http://www.lucidimagination.com


Re: Solr 3.5 very slow (performance)

2011-11-29 Thread Pawel Rog
On Tue, Nov 29, 2011 at 9:13 PM, Chris Hostetter
hossman_luc...@fucit.org wrote:

 Let's back up a minute and cover some basics...

 1) You said that you built a brand new index on a brand new master server,
 using Solr 3.5 -- how do you build your indexes?  did the source data
 change at all? does your new index have the same number of docs as your
 previous Solr 1.4 index?  what does a directory listing (including file
 sizes) look like for both your old and new indexes?

Yes, both indexes have the same data. The indexes are built using a C++
program which reads data from a database and inserts it into Solr
(using XML). Both indexes are about 8GB in size and have 18 million documents.


 2) Did you try using your Solr 1.4 index (and configs) directly in Solr
 3.5 w/o rebuilding from scratch?

Yes, I used the same configs in Solr 1.4 and Solr 3.5 (adding only the line
about luceneMatchVersion).
As far as I can see from the Solr 3.5 example solrconfig.xml in the
repository, there are not many differences.

 3) You said you build the new index on a new mmachine, but then you said
 you used a slave where the performanne was worse then Solr 1.4 on the
 same machine ... are you running both the Solr 1.4 and Solr 3.5 instances
 concurrently on your slave machine?  How much physical ram is on that
 machine? what JVM options are using when running the Solr 3.5 instance?
 what servlet container are you using?

Maybe I didn't write precisely enough. I have one machine on which
there is the master node, and a second machine on which there is the slave. I
tested Solr 1.4 on that machine, then turned it off and turned on
Solr 3.5. I have 36GB RAM on that machine.
For both Solr 1.4 and 3.5 the JVM configuration is the same, and the
servlet container is the same ... jetty-6.

JVM options: -server -Xms12000m -Xmx12000m -XX:+UseParNewGC
-XX:+UseConcMarkSweepGC -XX:NewSize=1500m -XX:ParallelGCThreads=8
-XX:CMSInitiatingOccupancyFraction=60

 4) what does your request handler configuration look like?  do you have
 any default/invariant/appended request params?

<requestHandler name="standard" class="solr.SearchHandler" default="true">
  <lst name="defaults">
    <str name="echoParams">explicit</str>
  </lst>
</requestHandler>
<requestHandler name="/admin/"
  class="org.apache.solr.handler.admin.AdminHandlers" />
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="slave">
    <!-- fully qualified url for the replication handler of master. It
         is possible to pass on this as a request param for the fetchindex command -->
    <str name="masterUrl">http://${masterHost}:${masterPort}/solr-3.5/${solr.core.instanceDir}replication</str>
    <str name="pollInterval">00:00:02</str>
    <str name="httpConnTimeout">5000</str>
    <str name="httpReadTimeout">1</str>
  </lst>
</requestHandler>


 5) The descriptions youve given of how the performance has changed sound
 like you are doing concurrent load testing -- did you do cache warming before 
 you
 started your testing?  how many client threads are hitting the solr server
 at one time?

Maybe I wasn't precise enough again. CPU on Solr 1.4 was 200% and on
Solr 3.5 it was 1200%.
Yes, there is cache warming. There are 50-100 client threads on both
1.4 and 3.5. There are about 60 requests per second on both 3.5 and 1.4,
but on 3.5 responses are slower and CPU usage is much higher.

 6) have you tried doing some basic manual testing to see how individual
 requests performe?  ie: single client at a time, loading a URL, then
 request the same URL again to verify that your Solr caches are in use and
 the QTime is low.  If you see slow respone times even when manually
 executing single requests at a time, have you tried using debug=timing
 to see which serach components are contributing the most to the slow
 QTimes?

Most of the time is spent in org.apache.solr.handler.component.QueryComponent and
org.apache.solr.handler.component.DebugComponent in the process phase. I didn't
compare individual request performance.

 7) What do the cache stats look like on your Solr 3.5 instance after
 you've done some of this timing testing?  the output of...
 http://localhost:8983/solr/admin/mbeans?cat=CACHE&stats=true&wt=json&indent=true
 ...would be helpful. NOTE: you may need to add this to your solrconfig.xml
 for that URL to work...
  <requestHandler name="/admin/" class="solr.admin.AdminHandlers" />


Will check it :)


 : in my last pos i mean
 : default operation AND
 : promoted - int
 : ending - int
 : b_count - int
 : name - text
 : cat1 - int
 : cat2 - int
 :
 : On Tue, Nov 29, 2011 at 7:54 PM, Pawel Rog pawelro...@gmail.com wrote:
 :  examples
 : 
 :  
 facet=truesort=promoted+desc,ending+asc,b_count+descfacet.mincount=1start=0q=name:(kurtka+skóry+brazowe42)facet.limit=500facet.field=cat1facet.field=cat2wt=jsonrows=50
 : 
 :  
 facet=truesort=promoted+desc,ending+asc,b_count+descfacet.mincount=1start=1350q=name:naczepafacet.limit=500facet.field=cat1facet.field=cat2wt=jsonrows=50
 : 
 :  
 facet=truesort=promoted+desc,ending+asc

Re: Solr 3.5 very slow (performance)

2011-11-29 Thread Pawel Rog
IO waits are about 0-2%.
I didn't see any suspicious activity in the logs, but I can check again.

On Tue, Nov 29, 2011 at 11:40 PM, Darren Govoni dar...@ontrenet.com wrote:
 Any suspicous activity in the logs? what about disk activity?


 On 11/29/2011 05:22 PM, Pawel Rog wrote:

 On Tue, Nov 29, 2011 at 9:13 PM, Chris Hostetter
 hossman_luc...@fucit.org  wrote:

 Let's back up a minute and cover some basics...

 1) You said that you built a brand new index on a brand new master
 server,
 using Solr 3.5 -- how do you build your indexes?  did the source data
 change at all? does your new index have the same number of docs as your
 previous Solr 1.4 index?  what does a directory listing (including file
 sizes) look like for both your old and new indexes?

 Yes, both indexes have same data. Indexes are build using some C++
 programm which reads data from database and inserts it into Solr
 (using XML). Both indexes have about 8GB size and 18milions documents.


 2) Did you try using your Solr 1.4 index (and configs) directly in Solr
 3.5 w/o rebuilding from scratch?

 Yes I used the same configs in solr 1.4 and solr 3.5 (adding only line
 about luceneMatchVersion)
 As I see in example of solr 3.5 in repository (solrconfig.xml) there
 are not many diffrences.

 3) You said you build the new index on a new mmachine, but then you said
 you used a slave where the performanne was worse then Solr 1.4 on the
 same machine ... are you running both the Solr 1.4 and Solr 3.5
 instances
 concurrently on your slave machine?  How much physical ram is on that
 machine? what JVM options are using when running the Solr 3.5 instance?
 what servlet container are you using?

 Mayby I didn't wrote precisely enough. I have some machine on which
 there is master node. I have second machine on which there is slave. I
 tested solr 1.4 on that machine, then turned it off and turned on
 solr-3.5. I have 36GB RAM on that machine.
 On both - solr 1.4 and 3.5 configuration of JVM is the same, and the
 same servlet container ... jetty-6

 JVM options: -server -Xms12000m -Xmx12000m -XX:+UseParNewGC
 -XX:+UseConcMarkSweepGC -XX:NewSize=1500m -XX:ParallelGCThreads=8
 -XX:CMSInitiatingOccupancyFraction=60

 4) what does your request handler configuration look like?  do you have
 any default/invariant/appended request params?

 requestHandler name=standard class=solr.SearchHandler default=true
        lst name=defaults
        str name=echoParamsexplicit/str
        /lst
 /requestHandler
 requestHandler name=/admin/
 class=org.apache.solr.handler.admin.AdminHandlers /
 requestHandler name=/replication class=solr.ReplicationHandler
        lst name=slave
                        !--fully qualified url for the replication handler
 of master . It
 is possible to pass on this as a request param for the
 fetchindexommand--
                str
 name=masterUrlhttp://${masterHost}:${masterPort}/solr-3.5/${solr.core.instanceDir}replication/str
                str name=pollInterval00:00:02/str
                str name=httpConnTimeout5000/str
                str name=httpReadTimeout1/str
        /lst
 /requestHandler


 5) The descriptions youve given of how the performance has changed sound
 like you are doing concurrent load testing -- did you do cache warming
 before you
 started your testing?  how many client threads are hitting the solr
 server
 at one time?

 Maybe I wasn't precise enough again. CPU on solr 1.4 was 200% and on
 solr 3.5 1200%
 yes there is cache warming. There are 50-100 client threads on both
 1.4 and 3.5. There are about 60 requests per second on 3.5 and on 1.4,
 but on 3.5 responses are slower and CPU usage much higher.

 6) have you tried doing some basic manual testing to see how individual
 requests performe?  ie: single client at a time, loading a URL, then
 request the same URL again to verify that your Solr caches are in use and
 the QTime is low.  If you see slow respone times even when manually
 executing single requests at a time, have you tried using debug=timing
 to see which serach components are contributing the most to the slow
 QTimes?

 Most time is in org.apache.solr.handler.component.QueryComponent and
 org.apache.solr.handler.component.DebugComponent in process. I didn't
 comare individual request performance.

 7) What do the cache stats look like on your Solr 3.5 instance after
 you've done some of this timing testing?  the output of...

 http://localhost:8983/solr/admin/mbeans?cat=CACHE&stats=true&wt=json&indent=true
 ...would be helpful. NOTE: you may need to add this to your
 solrconfig.xml
 for that URL to work...
  <requestHandler name="/admin/" class="solr.admin.AdminHandlers" />

 Will check it :)

 : in my last pos i mean
 : default operation AND
 : promoted - int
 : ending - int
 : b_count - int
 : name - text
 : cat1 - int
 : cat2 - int
 :
 : On Tue, Nov 29, 2011 at 7:54 PM, Pawel Rogpawelro...@gmail.com
  wrote:
 :  examples
 :
 :
  facet=truesort=promoted+desc,ending+asc,b_count+descfacet.mincount=1start=0q