Can we manipulate termfreq to count as 1 for multiple matches?

2013-03-13 Thread roz dev
Hi All

I am wondering if there is a way to make the term frequency of a certain
field count as 1, even if there are multiple matches in that document?

Use Case is:

Let's say that I have a document with 2 fields

- Name and
- Description

And, there is a document with data like this

Document_1
Name = Blue Jeans
Description = This jeans is very soft.  Jeans is pretty nice.

Now, if I search for Jeans, then Jeans is found in 2 places in the
Description field.

Term Frequency for Description is 2

I want Solr to count term frequency for Description as 1 even if Jeans is
found multiple times in this field.

For all other fields, I do want to get the term frequency as it is.

Is this doable in Solr with any of the functions?

Any inputs are welcome.
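For example, would marking the field so that term frequencies are not indexed
be the way to go? A rough sketch of the schema.xml change (the field type name
here is illustrative; note this also drops positions, so phrase queries on
Description would stop working):

  <field name="Description" type="text_general" indexed="true" stored="true"
         omitTermFreqAndPositions="true"/>

Another possible route might be a custom per-field similarity whose tf()
always returns 1, but I am not sure which is preferable.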

Thanks
Saroj


Re: can we configure spellcheck to be invoked after request processing?

2013-03-04 Thread roz dev
James,

You are right. I was setting up the spell checker incorrectly.

It works correctly, as you described.

The spell checker is invoked after the query component and it does not stop
Solr from executing the query.
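For reference, a minimal sketch of a request handler with spellcheck wired in
as a last-component (handler and component names are illustrative; the
"spellcheck" entry refers to a solr.SpellCheckComponent defined elsewhere in
solrconfig.xml):

  <requestHandler name="/select" class="solr.SearchHandler">
    <lst name="defaults">
      <str name="spellcheck">true</str>
      <str name="spellcheck.dictionary">default</str>
    </lst>
    <arr name="last-components">
      <str>spellcheck</str>
    </arr>
  </requestHandler>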

Thanks for correcting me.
Saroj





On Fri, Mar 1, 2013 at 7:30 AM, Dyer, James james.d...@ingramcontent.comwrote:

 I'm a little confused here because if you are searching q=jeap OR denim,
 then you should be getting both documents back.  Having spellcheck
 configured does not affect your search results at all.  Having it in your
 request will sometimes result in spelling suggestions, usually if one or
 more terms you queried are not in the index.  But if all of your query terms
 are optional then you need only have 1 term match anything to get results.
 You should get the same results regardless of whether or not you have
 spellcheck in the request.

 While spellcheck does not affect your query results, the results do affect
 spellcheck.  This is why you should put spellcheck in the last-components
 section of your request handler configuration.  This ensures that the query
 is run before spellcheck.

 James Dyer
 Ingram Content Group
 (615) 213-4311


 -Original Message-
 From: roz dev [mailto:rozde...@gmail.com]
 Sent: Thursday, February 28, 2013 6:33 PM
 To: solr-user@lucene.apache.org
 Subject: can we configure spellcheck to be invoked after request
 processing?

 Hi All,
 I may be asking a stupid question but please bear with me.

 Is it possible to configure Spell check to be invoked after Solr has
 processed the original query?

 My use case is :

 I am using DirectSpellChecker and have a document which has Denim as a
 term and there is another document which has Jeap.

 I am issuing a search for Jean or Denim.

 I am finding that this Solr query is giving me ZERO results and suggesting
 Jeap as an alternative.

 I want Solr to try to run the query for Jean or Denim and, only if there are
 no results found, then suggest Jeap as an alternative.

 Is this doable in Solr?

 Any suggestions?

 -Saroj




Re: How to re-read the config files in Solr, on a commit

2012-11-06 Thread roz dev
Erick

We have a requirement where a search admin can add or remove some synonyms and
would want these changes to be reflected in search thereafter.

Yes, we looked at the RELOAD command and it seems to be suitable for that
purpose. We have a master and slave setup, so it should be OK to issue the
reload command on the master. I expect that the slaves will pull the latest
config files.

Is the reload operation very costly, in terms of time and CPU? We have a
multicore setup and would need to issue a reload on multiple cores.
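For reference, the reload itself is just a CoreAdmin call, issued once per
core (host and core name are illustrative):

  http://master-host:8983/solr/admin/cores?action=RELOAD&core=core0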

Thanks
Saroj


On Tue, Nov 6, 2012 at 5:02 AM, Erick Erickson erickerick...@gmail.comwrote:

 Not that I know of. This would be extremely expensive in the usual case.
 Loading up configs, reconfiguring all the handlers etc. would add a huge
 amount of overhead to the commit operation, which is heavy enough as it is.

 What's the use-case here? Changing your configs really often and reading
 them on commit sounds like a way to make for a very confusing application!

 But if you really need to re-read all this info on a running system,
 consider the core admin RELOAD command.

 Best
 Erick


 On Mon, Nov 5, 2012 at 8:43 PM, roz dev rozde...@gmail.com wrote:

  Hi All
 
  I am keen to find out if Solr exposes any event listener or other hooks
  which can be used to re-read configuration files.
 
 
  I know that we have firstSearcher event but I am not sure if it causes
  request handlers to reload themselves and read the conf files again.
 
  For example, if I change the synonym file and solr gets a commit, will it
  re-initialize request handlers and re-read the conf files.
 
  Or, are there some events which can be listened to?
 
  Any inputs are welcome.
 
  Thanks
  Saroj
 



Re: How to re-read the config files in Solr, on a commit

2012-11-06 Thread roz dev
Thanks Otis for pointing this out.

We may end up using search-time synonyms for single-word synonyms and
index-time synonyms for multi-word synonyms.
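A rough sketch of what that split could look like in schema.xml (the field
type name and synonym file names are illustrative):

  <fieldType name="text_syn" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <!-- multi-word synonyms applied at index time -->
      <filter class="solr.SynonymFilterFactory" synonyms="synonyms-multiword.txt"
              ignoreCase="true" expand="true"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <!-- single-word synonyms applied at query time -->
      <filter class="solr.SynonymFilterFactory" synonyms="synonyms-singleword.txt"
              ignoreCase="true" expand="true"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>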

-Saroj


On Tue, Nov 6, 2012 at 8:09 PM, Otis Gospodnetic otis.gospodne...@gmail.com
 wrote:

 Hi,

 Note about modifying synonyms - you need to reindex, really, if using
 index-time synonyms. And if you're using search-time synonyms you have the
 multi-word synonym issue described on the wiki.

 Otis
 --
 Performance Monitoring - http://sematext.com/spm
 On Nov 6, 2012 11:02 PM, roz dev rozde...@gmail.com wrote:

  Erick
 
  We have a requirement where seach admin can add or remove some synonyms
 and
  would want these changes to be reflected in search thereafter.
 
  yes, we looked at reload command and it seems to be suitable for that
  purpose. We have a master and slave setup so it should be OK to issue
  reload command on master. I expect that slaves will pull the latest
 config
  files.
 
  Is reload operation very costly, in terms of time and cpu? We have a
  multicore setup and would need to issue reload on multiple cores.
 
  Thanks
  Saroj
 
 
  On Tue, Nov 6, 2012 at 5:02 AM, Erick Erickson erickerick...@gmail.com
  wrote:
 
   Not that I know of. This would be extremely expensive in the usual
 case.
   Loading up configs, reconfiguring all the handlers etc. would add a
 huge
   amount of overhead to the commit operation, which is heavy enough as it
  is.
  
   What's the use-case here? Changing your configs really often and
 reading
   them on commit sounds like a way to make for a very confusing
  application!
  
   But if you really need to re-read all this info on a running system,
   consider the core admin RELOAD command.
  
   Best
   Erick
  
  
   On Mon, Nov 5, 2012 at 8:43 PM, roz dev rozde...@gmail.com wrote:
  
Hi All
   
I am keen to find out if Solr exposes any event listener or other
 hooks
which can be used to re-read configuration files.
   
   
I know that we have firstSearcher event but I am not sure if it
 causes
request handlers to reload themselves and read the conf files again.
   
For example, if I change the synonym file and solr gets a commit,
 will
  it
re-initialize request handlers and re-read the conf files.
   
Or, are there some events which can be listened to?
   
Any inputs are welcome.
   
Thanks
Saroj
   
  
 



Re: How to change the boost of fields in edismx at runtime

2012-11-05 Thread roz dev
Thanks Hoss.

Yes, that approach would work as I can change the query.

Is there a way to extend the edismax handler to read a config file at
startup and then use some event, like a commit, to instruct the edismax
handler to re-read the config file?

That way, I can ensure that my boost params live only in the Solr servers'
config files, and if I need to change them, I would just change the file and
wait for a commit to re-read it.
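For reference, a rough sketch of the two pieces Hoss describes below (field
names and boosts are illustrative): the boosts sit in solrconfig.xml only as
defaults, and any request can still override them at query time without a
restart.

  <requestHandler name="/browse" class="solr.SearchHandler">
    <lst name="defaults">
      <str name="defType">edismax</str>
      <str name="qf">title^1 keyword^5</str>
    </lst>
  </requestHandler>

and at query time:

  q=jeans&defType=edismax&qf=title^2 keyword^10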

Any inputs?

-Saroj


On Thu, Nov 1, 2012 at 2:50 PM, Chris Hostetter hossman_luc...@fucit.orgwrote:


 : Then, If I find that results are not of my liking then I would like to
 : change the boost as following
 :
 : - Title - boosted to 2
 : -Keyword - boosted to 10
 :
 : Is there any way to change this boost, at run-time, without having to
 : restart solr with new boosts in edismax?

 edismax field boosts (specified in the qf and pf params) can always be
 specified at runtime -- first and foremost they are query params.

 when you put them in your solrconfig.xml file, those are just the defaults
 (or invariants, or appends) of those query params.



 -Hoss



Re: SolrJ - IOException

2012-09-24 Thread roz dev
I have seen this happening.

We retry and that works. Is your Solr server stalled?
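For what it's worth, a simple retry wrapper is usually enough - a rough sketch
only (the attempt count and backoff are arbitrary, and the SolrServer instance
is whichever one you already use):

  import java.io.IOException;

  import org.apache.solr.client.solrj.SolrServer;
  import org.apache.solr.client.solrj.SolrServerException;
  import org.apache.solr.common.SolrInputDocument;

  public class RetryingAdd {

      // Try the add a few times before giving up; IOExceptions under heavy
      // load are often transient.
      public static void addWithRetry(SolrServer server, SolrInputDocument doc, int maxAttempts)
              throws SolrServerException, IOException, InterruptedException {
          for (int attempt = 1; ; attempt++) {
              try {
                  server.add(doc);
                  return;
              } catch (SolrServerException e) {
                  if (attempt >= maxAttempts) throw e;   // give up after maxAttempts tries
              } catch (IOException e) {
                  if (attempt >= maxAttempts) throw e;
              }
              Thread.sleep(1000L * attempt);             // simple linear backoff before retrying
          }
      }
  }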

On Mon, Sep 24, 2012 at 4:50 PM, balaji.gandhi
balaji.gan...@apollogrp.eduwrote:

 Hi,

 I am encountering this error randomly (under load) when posting to Solr
 using SolrJ.

 Has anyone encountered a similar error?

 org.apache.solr.client.solrj.SolrServerException: IOException occured when
 talking to server at: http://localhost:8080/solr/profile at

 org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:414)
 at

 org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:182)
 at

 org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:117)
 at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:122) at
 org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:107) at

 Thanks,
 Balaji



 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/SolrJ-IOException-tp4010026.html
 Sent from the Solr - User mailing list archive at Nabble.com.



IndexDocValues in Solr

2012-08-03 Thread roz dev
Changing the subject line to make it easier to understand the topic of the
message.

Is there any plan to expose IndexDocValues as part of Solr 4?

Any thoughts?

-Saroj

On Thu, Aug 2, 2012 at 5:10 PM, roz dev rozde...@gmail.com wrote:

 As we all know, FieldCache can be costly if we have lots of documents and
 lots of fields to sort on.
 I see that IndexDocValues are better at sorting and faceting, w.r.t. memory
 usage.

 Is there any plan to use IndexDocValues in SOLR for doing sorting and
 faceting?

 Will SOLR 4 or 5 have indexDocValues? Is there an easy way to use
 IndexDocValues in Solr even though it is not implemented yet?

 -Saroj




Re: Memory leak?? with CloseableThreadLocal with use of Snowball Filter

2012-08-01 Thread roz dev
Thanks Robert for these inputs.

Since we do not really need the Snowball analyzer for this field, we will not
use it for now. If this still does not address our issue, we will tweak the
thread pool as per eks dev's suggestion - I am a bit hesitant to make this
change yet, as we would be reducing the thread pool, which can adversely
impact our throughput.

If the Snowball filter is being optimized for the Solr 4 beta then it would
be great for us. If you have already filed a JIRA for this then please let
me know, as I would like to follow it.

Thanks again
Saroj





On Wed, Aug 1, 2012 at 8:37 AM, Robert Muir rcm...@gmail.com wrote:

 On Tue, Jul 31, 2012 at 2:34 PM, roz dev rozde...@gmail.com wrote:
  Hi All
 
  I am using Solr 4 from trunk and using it with Tomcat 6. I am noticing
 that
  when we are indexing lots of data with 16 concurrent threads, Heap grows
  continuously. It remains high and ultimately most of the stuff ends up
  being moved to Old Gen. Eventually, Old Gen also fills up and we start
  getting into excessive GC problem.

 Hi: I don't claim to know anything about how tomcat manages threads,
 but really you shouldn't have all these objects.

 In general snowball stemmers should be reused per-thread-per-field.
 But if you have a lot of fields*threads, especially if there really is
 high thread churn on tomcat, then this could be bad with snowball:
 see eks dev's comment on https://issues.apache.org/jira/browse/LUCENE-3841

 I think it would be useful to see if you can tune tomcat's threadpool
 as he describes.

 separately: Snowball stemmers are currently really ram-expensive for
 stupid reasons.
 each one creates a ton of Among objects, e.g. an EnglishStemmer today
 is about 8KB.

 I'll regenerate these and open a JIRA issue: as the snowball code
 generator in their svn was improved
 recently and each one now takes about 64 bytes instead (the Among's
 are static and reused).

 Still this won't really solve your problem, because the analysis
 chain could have other heavy parts
 in initialization, but it seems good to fix.

 As a workaround until then you can also just use the good old
 PorterStemmer (PorterStemFilterFactory in solr).
 Its not exactly the same as using Snowball(English) but its pretty
 close and also much faster.

 --
 lucidimagination.com



Memory leak?? with CloseableThreadLocal with use of Snowball Filter

2012-07-31 Thread roz dev
Hi All

I am using Solr 4 from trunk and using it with Tomcat 6. I am noticing that
when we are indexing lots of data with 16 concurrent threads, the heap grows
continuously. It remains high and ultimately most of the stuff ends up
being moved to Old Gen. Eventually, Old Gen also fills up and we start
getting into an excessive GC problem.

I took a heap dump and found that most of the memory is consumed by
CloseableThreadLocal, which is holding a WeakHashMap of threads and their
state.

Most of the old gen is full, with the ThreadLocal eating up 3GB of heap, and
the heap dump shows that all such entries are using the Snowball filter. I
looked into LUCENE-3841 and verified that my version of Solr 4 has that code.

So, I am wondering about the reason for this memory leak - is it due to some
other bug in Solr/Lucene?

Here is a brief snapshot of the heap dump showing the problem:

Class Name                                                                                        | Shallow Heap | Retained Heap
---------------------------------------------------------------------------------------------------------------------------------
*org.apache.solr.schema.IndexSchema$SolrIndexAnalyzer @ 0x300c3eb28                               |           24 | 3,885,213,072*
|- class class org.apache.solr.schema.IndexSchema$SolrIndexAnalyzer @ 0x2f9753340                 |            0 |             0
|- this$0 org.apache.solr.schema.IndexSchema @ 0x300bf4048                                        |           96 |       276,704
*|- reuseStrategy org.apache.lucene.analysis.Analyzer$PerFieldReuseStrategy @ 0x300c3eb40         |           16 | 3,885,208,728*
|  |- class class org.apache.lucene.analysis.Analyzer$PerFieldReuseStrategy @ 0x2f98368c0         |            0 |             0
|  |- storedValue org.apache.lucene.util.CloseableThreadLocal @ 0x300c3eb50                       |           24 | 3,885,208,712
|  |  |- class class org.apache.lucene.util.CloseableThreadLocal @ 0x2f9788918                    |            8 |             8
|  |  |- t java.lang.ThreadLocal @ 0x300c3eb68                                                    |           16 |            16
|  |  |  '- class class java.lang.ThreadLocal @ 0x2f80f0868 System Class                          |            8 |            24
*|  |  |- hardRefs java.util.WeakHashMap @ 0x300c3eb78                                            |           48 | 3,885,208,656*
|  |  |  |- class class java.util.WeakHashMap @ 0x2f8476c00 System Class                          |           16 |            16
|  |  |  |- table java.util.WeakHashMap$Entry[16] @ 0x300c3eba8                                   |           80 | 2,200,016,960
|  |  |  |  |- class class java.util.WeakHashMap$Entry[] @ 0x2f84789e8                            |            0 |             0
*|  |  |  |  |- [7] java.util.WeakHashMap$Entry @ 0x306a24950                                     |           40 |   318,502,920*
|  |  |  |  |  |- class class java.util.WeakHashMap$Entry @ 0x2f84786f8 System Class              |            0 |             0
|  |  |  |  |  |- queue java.lang.ref.ReferenceQueue @ 0x300c3ebf8                                |           32 |            48
|  |  |  |  |  |- referent java.lang.Thread @ 0x30678c2c0  web-23                                 |          112 |           160
|  |  |  |  |  |- value java.util.HashMap @ 0x30678cbb0                                           |           48 |   318,502,880
|  |  |  |  |  |  |- class class java.util.HashMap @ 0x2f80b9428 System Class                     |           24 |            24
*|  |  |  |  |  |  |- table java.util.HashMap$Entry[32768] @ 0x3c07c6f58                          |      131,088 |   318,502,832*
|  |  |  |  |  |  |  |- class class java.util.HashMap$Entry[] @ 0x2f80bd9c8                       |            0 |             0
|  |  |  |  |  |  |  |- [10457] java.util.HashMap$Entry @ 0x30678cbe0                             |           32 |        40,864
|  |  |  |  |  |  |  |  |- class class java.util.HashMap$Entry @ 0x2f80bd400 System Class         |            0 |             0
|  |  |  |  |  |  |  |  |- key java.lang.String @ 0x30678cc00  prod_desc_keywd_en_CA              |           32 |            96
|  |  |  |  |  |  |  |  |- value org.apache.solr.analysis.TokenizerChain$SolrTokenStreamComponents @ 0x30678cc60 | 24 | 20,344
|  |  |  |  |  |  |  |  |- next java.util.HashMap$Entry @ 0x39a2c9100                             |           32 |        20,392
|  |  |  |  |  |  |  |  |  |- class class java.util.HashMap$Entry @ 0x2f80bd400 System Class      |            0 |             0
|  |  |  |  |  |  |  |  |  |- key java.lang.String @ 0x39a2c9120  3637994_fr_CA_cat_name_keywd    |           32 |           104
|  |  |  |  |  |  |  |  |  |- value org.apache.solr.analysis.TokenizerChain$SolrTokenStreamComponents @ 0x39a2c9188 | 24 | 20,256
|  |  |  |  |  |  |  |  |  |  |- class class org.apache.solr.analysis.TokenizerChain$SolrTokenStreamComponents @ 0x2f97a69a0 | 0 | 0
|  |  |  |  |  |  |  |  |  |  ...

Re: solr/tomcat stops responding

2012-07-31 Thread roz dev
You are referring to a very old thread.

Did you take any heap dump and thread dump? They can help you get more
insight.

-Saroj


On Tue, Jul 31, 2012 at 9:04 AM, Suneel pandey.sun...@gmail.com wrote:

 Hello Kevin,

 I am also facing the same problem. After a few hours or a few days my Solr
 server crashes.
 I tried to download the following patch, but it's not accessible now. I am
 using the 3.1 version of Solr.

 http://people.apache.org/~yonik/solr/current/solr.war



 -
 Regards,

 Suneel Pandey
 Sr. Software Developer
 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/solr-tomcat-stops-responding-tp474577p3998435.html
 Sent from the Solr - User mailing list archive at Nabble.com.



Re: too many instances of org.tartarus.snowball.Among in the heap

2012-07-30 Thread roz dev
:132)
at java.lang.Thread.run(Thread.java:662)

   Locked ownable synchronizers:
- None

Agent Heartbeat - Thread t@5
   java.lang.Thread.State: TIMED_WAITING
at java.lang.Thread.sleep(Native Method)
at
com.wily.util.heartbeat.IntervalHeartbeat$HeartbeatRunnable.run(IntervalHeartbeat.java:670)
at java.lang.Thread.run(Thread.java:662)

   Locked ownable synchronizers:
- None

Remove Metric Data Watch Heartbeat Heartbeat - Thread t@7
   java.lang.Thread.State: TIMED_WAITING
at java.lang.Thread.sleep(Native Method)
at
com.wily.util.heartbeat.IntervalHeartbeat$HeartbeatRunnable.run(IntervalHeartbeat.java:670)
at java.lang.Thread.run(Thread.java:662)

   Locked ownable synchronizers:
- None

Configuration Watch Heartbeat Heartbeat - Thread t@6
   java.lang.Thread.State: TIMED_WAITING
at java.lang.Thread.sleep(Native Method)
at
com.wily.util.heartbeat.IntervalHeartbeat$HeartbeatRunnable.run(IntervalHeartbeat.java:670)
at java.lang.Thread.run(Thread.java:662)

   Locked ownable synchronizers:
- None

Signal Dispatcher - Thread t@4
   java.lang.Thread.State: RUNNABLE

   Locked ownable synchronizers:
- None

Finalizer - Thread t@3
   java.lang.Thread.State: WAITING
at java.lang.Object.wait(Native Method)
- waiting on 48c6254f (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:118)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:134)
at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:159)

   Locked ownable synchronizers:
- None

Reference Handler - Thread t@2
   java.lang.Thread.State: WAITING
at java.lang.Object.wait(Native Method)
- waiting on 48bb8adc (a java.lang.ref.Reference$Lock)
at java.lang.Object.wait(Object.java:485)
at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:116)

   Locked ownable synchronizers:
- None

main - Thread t@1
   java.lang.Thread.State: RUNNABLE
at java.net.PlainSocketImpl.socketAccept(Native Method)
at java.net.PlainSocketImpl.accept(PlainSocketImpl.java:390)
- locked 11dacd96 (a java.net.SocksSocketImpl)
at java.net.ServerSocket.implAccept(ServerSocket.java:462)
at
com.wily.introscope.agent.probe.net.ManagedServerSocket.com_wily_accept14(ManagedServerSocket.java:362)
at
com.wily.introscope.agent.probe.net.ManagedServerSocket.accept(ManagedServerSocket.java:267)
at
org.apache.catalina.core.StandardServer.await(StandardServer.java:431)
at org.apache.catalina.startup.Catalina.await(Catalina.java:676)
at org.apache.catalina.startup.Catalina.start(Catalina.java:628)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:289)
at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:414)

   Locked ownable synchronizers:
- None



On Fri, Jul 27, 2012 at 5:19 AM, Alexandre Rafalovitch
arafa...@gmail.comwrote:

 Try taking a couple of thread dumps and see where in the stack the
 snowball classes show up. That might give you a clue.

 Did you customize the parameters to the stemmer? If so, maybe it has
 problems with the file you gave it.

 Just some generic thoughts that might help.

 Regards,
Alex.
 Personal blog: http://blog.outerthoughts.com/
 LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
 - Time is the quality of nature that keeps events from happening all
 at once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD
 book)


 On Fri, Jul 27, 2012 at 3:53 AM, roz dev rozde...@gmail.com wrote:
  Hi All
 
  I am trying to find out the reason for very high memory use and ran JMAP
  -hist
 
  It is showing that i have too many instances of
 org.tartarus.snowball.Among
 
  Any ideas what is this for and why am I getting so many of them
 
  num   #instances#bytes  Class description
 
 --
  *1:  467281101869124400
  org.tartarus.snowball.Among
  *
  2:  5244210 1840458960  byte[]



Re: too many instances of org.tartarus.snowball.Among in the heap

2012-07-30 Thread roz dev
Is it some kind of memory leak with Lucene's use of the Snowball stemmer?

I tried to google for the Snowball stemmer but could not find any recent
info about memory leaks.

This old link does indicate some memory leak, but it is from 2004:

http://snowball.tartarus.org/archives/snowball-discuss/0631.html

Any inputs are welcome

-Saroj




On Mon, Jul 30, 2012 at 4:39 PM, roz dev rozde...@gmail.com wrote:

 I did take a couple of thread dumps and they seem to be fine.

 Heap dump is huge - close to 15GB.

 I am having a hard time analyzing that heap dump.

 2012-07-30 16:07:32
 Full thread dump Java HotSpot(TM) 64-Bit Server VM (19.0-b09 mixed mode):

 RMI TCP Connection(33)-10.8.21.124 - Thread t@190
java.lang.Thread.State: RUNNABLE
 at sun.management.ThreadImpl.dumpThreads0(Native Method)
 at sun.management.ThreadImpl.dumpAllThreads(ThreadImpl.java:374)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at
 com.sun.jmx.mbeanserver.ConvertingMethod.invokeWithOpenReturn(ConvertingMethod.java:167)
 at
 com.sun.jmx.mbeanserver.MXBeanIntrospector.invokeM2(MXBeanIntrospector.java:96)
 at
 com.sun.jmx.mbeanserver.MXBeanIntrospector.invokeM2(MXBeanIntrospector.java:33)
 at
 com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:208)
 at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:120)
 at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:262)
 at javax.management.StandardMBean.invoke(StandardMBean.java:391)
 at
 com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:836)
 at
 com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:761)
 at
 javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1427)
 at
 javax.management.remote.rmi.RMIConnectionImpl.access$200(RMIConnectionImpl.java:72)
 at
 javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1265)
 at
 javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1360)
 at
 javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:788)
 at sun.reflect.GeneratedMethodAccessor50.invoke(Unknown Source)
 at
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:305)
 at sun.rmi.transport.Transport$1.run(Transport.java:159)
 at java.security.AccessController.doPrivileged(Native Method)
 at sun.rmi.transport.Transport.serviceCall(Transport.java:155)
 at
 sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:535)
 at
 sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:790)
 at
 sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:649)
 at
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:662)

Locked ownable synchronizers:
 - locked 49cbecf2 (a
 java.util.concurrent.locks.ReentrantLock$NonfairSync)

 JMX server connection timeout 189 - Thread t@189
java.lang.Thread.State: TIMED_WAITING
 at java.lang.Object.wait(Native Method)
 - waiting on b75fa27 (a [I)
 at
 com.sun.jmx.remote.internal.ServerCommunicatorAdmin$Timeout.run(ServerCommunicatorAdmin.java:150)
 at java.lang.Thread.run(Thread.java:662)

Locked ownable synchronizers:
 - None

 web-77 - Thread t@186
java.lang.Thread.State: WAITING
 at sun.misc.Unsafe.park(Native Method)
 - parking to wait for 5ab03cb6 (a
 java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
 at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
 at
 java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
 at
 java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399)
 at
 java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:947)
 at
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
 at java.lang.Thread.run(Thread.java:662)

Locked ownable synchronizers:
 - None

 web-76 - Thread t@185
java.lang.Thread.State: WAITING
 at sun.misc.Unsafe.park(Native Method)
 - parking to wait for 5ab03cb6 (a
 java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
 at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158

Re: leaks in solr

2012-07-27 Thread roz dev
In my case, I see only 1 searcher and no FieldCache - still, Old Gen is
almost full at 22 GB.

Does it have to do with the index or some other configuration?

-Saroj

On Thu, Jul 26, 2012 at 7:41 PM, Lance Norskog goks...@gmail.com wrote:

 What does the Statistics page in the Solr admin say? There might be
 several searchers open: org.apache.solr.search.SolrIndexSearcher

 Each searcher holds open different generations of the index. If
 obsolete index files are held open, it may be old searchers. How big
 are the caches? How long does it take to autowarm them?

 On Thu, Jul 26, 2012 at 6:15 PM, Karthick Duraisamy Soundararaj
 karthick.soundara...@gmail.com wrote:
  Mark,
  We use solr 3.6.0 on freebsd 9. Over a period of time, it
  accumulates lots of space!
 
  On Thu, Jul 26, 2012 at 8:47 PM, roz dev rozde...@gmail.com wrote:
 
  Thanks Mark.
 
  We are never calling commit or optimize with openSearcher=false.
 
  As per logs, this is what is happening
 
 
 openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false}
 
  --
  But, We are going to use 4.0 Alpha and see if that helps.
 
  -Saroj
 
 
 
 
 
 
 
 
 
 
  On Thu, Jul 26, 2012 at 5:12 PM, Mark Miller markrmil...@gmail.com
  wrote:
 
   I'd take a look at this issue:
   https://issues.apache.org/jira/browse/SOLR-3392
  
   Fixed late April.
  
   On Jul 26, 2012, at 7:41 PM, roz dev rozde...@gmail.com wrote:
  
it was from 4/11/12
   
-Saroj
   
On Thu, Jul 26, 2012 at 4:21 PM, Mark Miller markrmil...@gmail.com
 
   wrote:
   
   
On Jul 26, 2012, at 3:18 PM, roz dev rozde...@gmail.com wrote:
   
Hi Guys
   
I am also seeing this problem.
   
I am using SOLR 4 from Trunk and seeing this issue repeat every
 day.
   
Any inputs about how to resolve this would be great
   
-Saroj
   
   
Trunk from what date?
   
- Mark
   
   
   
   
   
   
   
   
   
   
  
   - Mark Miller
   lucidimagination.com
  
  
  
  
  
  
  
  
  
  
  
  
 



 --
 Lance Norskog
 goks...@gmail.com



too many instances of org.tartarus.snowball.Among in the heap

2012-07-27 Thread roz dev
Hi All

I am trying to find out the reason for very high memory use and ran JMAP
-hist.

It is showing that I have too many instances of org.tartarus.snowball.Among.

Any ideas what this is for and why I am getting so many of them?

num   #instances#bytes  Class description
--
*1:  467281101869124400  org.tartarus.snowball.Among*
2:  5244210 1840458960  byte[]
3:  526519495969839368  char[]
4:  10008928864769280   int[]
5:  10250527410021080   java.util.LinkedHashMap$Entry
6:  4672811 268474232   org.tartarus.snowball.Among[]
*7:  8072312 258313984   java.util.HashMap$Entry*
8:  466514  246319392   org.apache.lucene.util.fst.FST$Arc[]
9:  1828542 237600432   java.util.HashMap$Entry[]
10: 3834312 153372480   java.util.TreeMap$Entry
11: 2684700 128865600   org.apache.lucene.util.fst.Builder$UnCompiledNode
12: 4712425 113098200   org.apache.lucene.util.BytesRef
13: 3484836 111514752   java.lang.String
14: 2636045 105441800   org.apache.lucene.index.FieldInfo
15: 1813561 101559416   java.util.LinkedHashMap
16: 6291619 100665904   java.lang.Integer
17: 2684700 85910400    org.apache.lucene.util.fst.Builder$Arc
18: 956998  84215824    org.apache.lucene.index.TermsHashPerField
19: 2892957 69430968    org.apache.lucene.util.AttributeSource$State
20: 2684700 64432800    org.apache.lucene.util.fst.Builder$Arc[]
21: 685595  60332360    org.apache.lucene.util.fst.FST
22: 933451  59210944    java.lang.Object[]
23: 957043  53594408    org.apache.lucene.util.BytesRefHash
24: 591463  42585336    org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader
25: 424801  40780896    org.tartarus.snowball.ext.EnglishStemmer
26: 424801  40780896    org.apache.lucene.analysis.miscellaneous.WordDelimiterFilter
27: 1549670 37192080    org.apache.lucene.index.Term
28: 849602  33984080    org.apache.lucene.analysis.miscellaneous.WordDelimiterFilter$WordDelimiterConcatenation
29: 424801  27187264    org.apache.lucene.analysis.core.WhitespaceTokenizer
30: 478499  26795944    org.apache.lucene.index.FreqProxTermsWriterPerField
31: 535521  25705008    org.apache.lucene.index.FreqProxTermsWriterPerField$FreqProxPostingsArray
32: 219081  24537072    org.apache.lucene.codecs.BlockTreeTermsWriter$TermsWriter
33: 478499  22967952    org.apache.lucene.index.FieldInvertState
34: 956998  22967952    org.apache.lucene.index.TermsHashPerField$PostingsBytesStartArray
35: 478499  22967952    org.apache.lucene.index.TermVectorsConsumerPerField
36: 478499  22967952    org.apache.lucene.index.NormsConsumerPerField
37: 316582  22793904    org.apache.lucene.store.MMapDirectory$MMapIndexInput
38: 906708  21760992    org.apache.lucene.util.AttributeSource$State[]
39: 906708  21760992    org.apache.lucene.analysis.tokenattributes.OffsetAttributeImpl
40: 883588  21206112    java.util.ArrayList
41: 438192  21033216    org.apache.lucene.store.RAMOutputStream
42: 860601  20654424    java.lang.StringBuilder
43: 424801  20390448    org.apache.lucene.analysis.miscellaneous.WordDelimiterIterator
44: 424801  20390448    org.apache.lucene.analysis.core.StopFilter
45: 424801  20390448    org.apache.lucene.analysis.miscellaneous.KeywordMarkerFilter
46: 424801  20390448    org.apache.lucene.analysis.snowball.SnowballFilter
47: 839390  20145360    org.apache.lucene.index.DocumentsWriterDeleteQueue$TermNode


-Saroj


Re: leaks in solr

2012-07-26 Thread roz dev
Hi Guys

I am also seeing this problem.

I am using SOLR 4 from Trunk and seeing this issue repeat every day.

Any inputs about how to resolve this would be great

-Saroj


On Thu, Jul 26, 2012 at 8:33 AM, Karthick Duraisamy Soundararaj 
karthick.soundara...@gmail.com wrote:

 Did you find any more clues? I have this problem in my machines as well..

 On Fri, Jun 29, 2012 at 6:04 AM, Bernd Fehling 
 bernd.fehl...@uni-bielefeld.de wrote:

  Hi list,
 
  while monitoring my solr 3.6.1 installation I recognized an increase of
  memory usage
  in OldGen JVM heap on my slave. I decided to force Full GC from jvisualvm
  and
  send optimize to the already optimized slave index. Normally this helps
  because
  I have monitored this issue over the past. But not this time. The Full GC
  didn't free any memory. So I decided to take a heap dump and see what
  MemoryAnalyzer
  is showing. The heap dump is about 23 GB in size.
 
  1.)
  Report Top consumers - Biggest Objects:
  Total: 12.3 GB
  org.apache.lucene.search.FieldCacheImpl : 8.1 GB
  class java.lang.ref.Finalizer   : 2.1 GB
  org.apache.solr.util.ConcurrentLRUCache : 1.5 GB
  org.apache.lucene.index.ReadOnlySegmentReader : 622.5 MB
  ...
 
  As you can see, Finalizer has already reached 2.1 GB!!!
 
  * java.util.concurrent.ConcurrentHashMap$Segment[16] @ 0x37b056fd0
* segments java.util.concurrent.ConcurrentHashMap @ 0x39b02d268
  * map org.apache.solr.util.ConcurrentLRUCache @ 0x398f33c30
* referent java.lang.ref.Finalizer @ 0x37affa810
  * next java.lang.ref.Finalizer @ 0x37affa838
  ...
 
  Seams to be org.apache.solr.util.ConcurrentLRUCache
  The attributes are:
 
  Type   |Name  | Value
  -
  boolean| isDestroyed  |  true
  -
  ref| cleanupThread|  null
  
  ref| evictionListener |  null
  ---
  long   | oldestEntry  | 0
  --
  int| acceptableWaterMark |  9500
 
 --
  ref| stats| org.apache.solr.util.ConcurrentLRUCache$Stats
  @ 0x37b074dc8
  
  boolean| islive   |  true
  -
  boolean| newThreadForCleanup | false
  
  boolean| isCleaning   | false
 
 
 
  ref| markAndSweepLock | java.util.concurrent.locks.ReentrantLock @
  0x39bf63978
  -
  int| lowerWaterMark   |  9000
  -
  int| upperWaterMark   | 1
  -
  ref|  map | java.util.concurrent.ConcurrentHashMap @
  0x39b02d268
  --
 
 
 
 
  2.)
  While searching for open files and their references I noticed that there
  are references to
  index files which are already deleted from disk.
  E.g. recent index files are data/index/_2iqw.frq and
  data/index/_2iqx.frq.
  But I also see references to data/index/_2hid.frq which are quite old
  and are deleted way back
  from earlier replications.
  I have to analyze this a bit deeper.
 
 
  So far my report, I go on analyzing this huge heap dump.
  If you need any other info or even the heap dump, let me know.
 
 
  Regards
  Bernd
 
 



Re: leaks in solr

2012-07-26 Thread roz dev
it was from 4/11/12

-Saroj

On Thu, Jul 26, 2012 at 4:21 PM, Mark Miller markrmil...@gmail.com wrote:


 On Jul 26, 2012, at 3:18 PM, roz dev rozde...@gmail.com wrote:

  Hi Guys
 
  I am also seeing this problem.
 
  I am using SOLR 4 from Trunk and seeing this issue repeat every day.
 
  Any inputs about how to resolve this would be great
 
  -Saroj


 Trunk from what date?

 - Mark












Re: leaks in solr

2012-07-26 Thread roz dev
Thanks Mark.

We are never calling commit or optimize with openSearcher=false.

As per logs, this is what is happening

openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false}

--
But, We are going to use 4.0 Alpha and see if that helps.

-Saroj










On Thu, Jul 26, 2012 at 5:12 PM, Mark Miller markrmil...@gmail.com wrote:

 I'd take a look at this issue:
 https://issues.apache.org/jira/browse/SOLR-3392

 Fixed late April.

 On Jul 26, 2012, at 7:41 PM, roz dev rozde...@gmail.com wrote:

  it was from 4/11/12
 
  -Saroj
 
  On Thu, Jul 26, 2012 at 4:21 PM, Mark Miller markrmil...@gmail.com
 wrote:
 
 
  On Jul 26, 2012, at 3:18 PM, roz dev rozde...@gmail.com wrote:
 
  Hi Guys
 
  I am also seeing this problem.
 
  I am using SOLR 4 from Trunk and seeing this issue repeat every day.
 
  Any inputs about how to resolve this would be great
 
  -Saroj
 
 
  Trunk from what date?
 
  - Mark
 
 
 
 
 
 
 
 
 
 

 - Mark Miller
 lucidimagination.com














Re: Issue with field collapsing in solr 4 while performing distributed search

2012-06-11 Thread roz dev
I think that there is no way around doing custom logic in this case.

If the indexing process knows that documents have to be grouped, then they
had better be together.
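For what it's worth, a bare-bones sketch of such client-side partitioning with
SolrJ (shard URLs, types and field usage are illustrative): every document
carrying the same group value is hashed to the same shard before indexing.

  import java.util.List;

  import org.apache.solr.client.solrj.impl.HttpSolrServer;
  import org.apache.solr.common.SolrInputDocument;

  public class GroupAwareIndexer {

      private final List<HttpSolrServer> shards; // one client per shard core, e.g. http://host:8983/solr/shard1

      public GroupAwareIndexer(List<HttpSolrServer> shards) {
          this.shards = shards;
      }

      // Hash the grouping value (the field later used as group.field) so that all
      // documents belonging to one group land in the same shard; ngroups is then
      // correct at query time.
      public void index(SolrInputDocument doc, String groupValue) throws Exception {
          int shard = (groupValue.hashCode() & 0x7fffffff) % shards.size();
          shards.get(shard).add(doc);
      }
  }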

-Saroj


On Mon, Jun 11, 2012 at 6:37 AM, Nitesh Nandy niteshna...@gmail.com wrote:

 Martijn,

 How do we add a custom algorithm for distributing documents in Solr Cloud?
 According to this discussion

 http://lucene.472066.n3.nabble.com/SolrCloud-how-to-index-documents-into-a-specific-core-and-how-to-search-against-that-core-td3985262.html
  , Mark discourages users from using custom distribution mechanism in Solr
 Cloud.

 Load balancing is not an issue for us at the moment. In that case, how
 should we implement a custom partitioning algorithm.


 On Mon, Jun 11, 2012 at 6:23 PM, Martijn v Groningen 
 martijn.v.gronin...@gmail.com wrote:

  The ngroups returns the number of groups that have matched with the
  query. However if you want ngroups to be correct in a distributed
  environment you need
  to put document belonging to the same group into the same shard.
  Groups can't cross shard boundaries. I guess you need to do
  some manual document partitioning.
 
  Martijn
 
  On 11 June 2012 14:29, Nitesh Nandy niteshna...@gmail.com wrote:
   Version: Solr 4.0 (svn build 30th may, 2012) with Solr Cloud  (2 slices
  and
   2 shards)
  
   The setup was done as per the wiki:
  http://wiki.apache.org/solr/SolrCloud
  
   We are doing distributed search. While querying, we use field
 collapsing
   with ngroups set as true as we need the number of search results.
  
   However, there is a difference in the number of result list returned
  and
   the ngroups value returned.
  
   Ex:
  
 
 http://localhost:8983/solr/select?q=message:blah%20AND%20userid:3&group=true&group.field=id&group.ngroups=true
  
  
   The response XML looks like:
  
   <response>
   <lst name="responseHeader">
     <int name="status">0</int>
     <int name="QTime">46</int>
     <lst name="params">
       <str name="group.field">id</str>
       <str name="group.ngroups">true</str>
       <str name="group">true</str>
       <str name="q">messagebody:monit AND usergroupid:3</str>
     </lst>
   </lst>
   <lst name="grouped">
     <lst name="id">
       <int name="matches">10</int>
       <int name="ngroups">9</int>
       <arr name="groups">
         <lst>
           <str name="groupValue">320043</str>
           <result name="doclist" numFound="1" start="0">
             <doc>...</doc>
           </result>
         </lst>
         <lst>
           <str name="groupValue">398807</str>
           <result name="doclist" numFound="5" start="0" maxScore="2.4154348">...
           </result>
         </lst>
         <lst>
           <str name="groupValue">346878</str>
           <result name="doclist" numFound="2" start="0">...</result>
         </lst>
         <lst>
           <str name="groupValue">346880</str>
           <result name="doclist" numFound="2" start="0">...</result>
         </lst>
       </arr>
     </lst>
   </lst>
   </response>
  
   So you can see that the ngroups value returned is 9 and the actual
 number
   of groups returned is 4
  
   Why do we have this discrepancy in the ngroups, matches and actual
 number
   of groups. Is this an open issue ?
  
Any kind of help is appreciated.
  
   --
   Regards,
  
   Nitesh Nandy
 
 
 
  --
  Met vriendelijke groet,
 
  Martijn van Groningen
 



 --
 Regards,

 Nitesh Nandy



Re: How to do custom sorting in Solr?

2012-06-10 Thread roz dev
Hi All


 I have an index which contains a Catalog of Products and Categories, with
 Solr 4.0 from trunk

 Data is organized like this:

 Category: Books

 Sub Category: Programming

 Products:

 Product # 1,  Price: Regular Sort Order:1
 Product # 2,  Price: Markdown, Sort Order:2
 Product # 3   Price: Regular, Sort Order:3
 Product # 4   Price: Regular, Sort Order:4
 
 .
 ...
 Product # 100   Price: Regular, Sort Order:100

 Sub Category: Fiction

 Products:

 Product # 1,  Price: Markdown, Sort Order:1
 Product # 2,  Price: Regular, Sort Order:2
 Product # 3   Price: Regular, Sort Order:3
 Product # 4   Price: Markdown, Sort Order:4
 
 .
 ...
 Product # 70   Price: Regular, Sort Order:70


 I want to query Solr and sort these products within each of the
 sub-categories in such a way that products which are on markdown are at
 the bottom of the document list, and other products,
 which are at regular price, are sorted as per their sort order in their
 sub-category.

 Expected Results are

 Category: Books

 Sub Category: Programming

 Products:

 Product # 1,  Price: Regular Sort Order:1
 Product # 2,  Price: Markdown, Sort Order:101
 Product # 3   Price: Regular, Sort Order:3
 Product # 4   Price: Regular, Sort Order:4
 
 .
 ...
 Product # 100   Price: Regular, Sort Order:100

 Sub Category: Fiction

 Products:

 Product # 1,  Price: Markdown, Sort Order:71
 Product # 2,  Price: Regular, Sort Order:2
 Product # 3   Price: Regular, Sort Order:3
 Product # 4   Price: Markdown, Sort Order:71
 
 .
 ...
 Product # 70   Price: Regular, Sort Order:70


 My query is like this:

 q=*:*&fq=category:Books

 What are the options to implement custom sorting and how do I do it?


- Define a Custom Function query?
- Define a Custom Comparator? Or,
- Define a Custom Collector?


 Please let me know the best way to go about it and any pointers to
 customize Solr 4.


Thanks
Saroj


Re: How to do custom sorting in Solr?

2012-06-10 Thread roz dev
Thanks Erik for your quick feedback

When products are assigned to a category or sub-category, they can be in any
order, and the price type can be regular or markdown.
So, regular and markdown products are intermingled as per their assignment,
but I want to sort them in such a way that we ensure that all the products
which are on markdown are at the bottom of the list.

I can use these multiple sorts but I realize that they are costly in terms
of heap used, as they are using FieldCache.

I have an index with 2M docs and docs are pretty big. So, I don't want to
use them unless there is no other option.

I am wondering if I can define a custom function query which can be like
this:


   - check if the product is on markdown
   - if yes, then change its sort order field to be the max value in the
   given sub-category, say 99
   - else, use the sort order of the product in the sub-category

I have been looking at existing function queries but do not have a good
handle on how to make one of my own.
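As a concrete example of the pseudo-logic above (field names are illustrative,
and this assumes a single-valued numeric flag field such as is_markdown
holding 1 for markdown and 0 otherwise), a function-based sort might look
like:

  sort=sum(sort_order,product(is_markdown,99999)) asc

One caveat: sorting by a function still pulls the referenced fields through
the FieldCache, so it may not actually reduce memory compared to a plain
multi-field sort.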

- Another option could be to use a custom sort comparator, but I am not sure
about the way it works.

Any thoughts?


-Saroj




On Sun, Jun 10, 2012 at 5:02 AM, Erick Erickson erickerick...@gmail.comwrote:

 Skimming this, two options come to mind:

 1 Simply apply primary, secondary, etc sorts. Something like
   sort=subcategory asc,markdown_or_regular desc,sort_order asc

 2 You could also use grouping to arrange things in groups and sort within
  those groups. This has the advantage of returning some members
  of each of the top N groups in the result set, which makes it easier
 to
  get some of each group rather than having to analyze the whole
 list

 But your example is somewhat contradictory. You say
 products which are on markdown, are at
 the bottom of the documents list

 But in your examples, products on markdown are intermingled

 Best
 Erick

 On Sun, Jun 10, 2012 at 3:36 AM, roz dev rozde...@gmail.com wrote:
  Hi All
 
 
  I have an index which contains a Catalog of Products and Categories,
 with
  Solr 4.0 from trunk
 
  Data is organized like this:
 
  Category: Books
 
  Sub Category: Programming
 
  Products:
 
  Product # 1,  Price: Regular Sort Order:1
  Product # 2,  Price: Markdown, Sort Order:2
  Product # 3   Price: Regular, Sort Order:3
  Product # 4   Price: Regular, Sort Order:4
  
  .
  ...
  Product # 100   Price: Regular, Sort Order:100
 
  Sub Category: Fiction
 
  Products:
 
  Product # 1,  Price: Markdown, Sort Order:1
  Product # 2,  Price: Regular, Sort Order:2
  Product # 3   Price: Regular, Sort Order:3
  Product # 4   Price: Markdown, Sort Order:4
  
  .
  ...
  Product # 70   Price: Regular, Sort Order:70
 
 
  I want to query Solr and sort these products within each of the
  sub-category in a such a way that products which are on markdown, are at
  the bottom of the documents list and other products
  which are on regular price, are sorted as per their sort order in their
  sub-category.
 
  Expected Results are
 
  Category: Books
 
  Sub Category: Programming
 
  Products:
 
  Product # 1,  Price: Regular Sort Order:1
  Product # 2,  Price: Markdown, Sort Order:101
  Product # 3   Price: Regular, Sort Order:3
  Product # 4   Price: Regular, Sort Order:4
  
  .
  ...
  Product # 100   Price: Regular, Sort Order:100
 
  Sub Category: Fiction
 
  Products:
 
  Product # 1,  Price: Markdown, Sort Order:71
  Product # 2,  Price: Regular, Sort Order:2
  Product # 3   Price: Regular, Sort Order:3
  Product # 4   Price: Markdown, Sort Order:71
  
  .
  ...
  Product # 70   Price: Regular, Sort Order:70
 
 
  My query is like this:
 
  q=*:*fq=category:Books
 
  What are the options to implement custom sorting and how do I do it?
 
 
 - Define a Custom Function query?
 - Define a Custom Comparator? Or,
 - Define a Custom Collector?
 
 
  Please let me know the best way to go about it and any pointers to
  customize Solr 4.
 
 
  Thanks
  Saroj



Re: How to do custom sorting in Solr?

2012-06-10 Thread roz dev
Yes, these documents have lots of unique values as the same product could
be assigned to lots of other categories and that too, in a different sort
order.

We did some evaluation of heap usage and found that, with the kind of queries
we generate, heap usage was going up to 24-26 GB. I could trace it to the
fact that the FieldCache is creating an array of size 2M for each of the sort
fields.

Since the same products are mapped to multiple categories, we incur a
significant memory overhead. Therefore, any solution where memory consumption
can be reduced is a good one for me.

In fact, we have situations where same product is mapped to more than 1
sub-category in the same category like


Books
 -- Programming
  - Java in a nutshell
 -- Sale (40% off)
  - Java in a nutshell


So, another thought in my mind is to somehow use a second-pass collector to
group books appropriately in the Programming and Sale categories, with the
right sort order.

But, I have no clue about that piece :(
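That said, the grouping route from Erick's earlier reply could be sketched as
a single query (field names are illustrative), with markdown items pushed to
the bottom inside each group:

  q=*:*&fq=category:Books&group=true&group.field=sub_category
    &group.limit=100&group.sort=is_markdown asc,sort_order asc

Each group then returns its top N documents according to group.sort.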

-Saroj


On Sun, Jun 10, 2012 at 4:30 PM, Erick Erickson erickerick...@gmail.comwrote:

 2M docs is actually pretty small. Sorting is sensitive to the number
 of _unique_ values in the sort fields, not necessarily the number of
 documents.

 And sorting only works on fields with a single value (i.e. it can't have
 more than one token after analysis). So for each field you're only talking
  2M values at the very maximum, assuming that the field in question has
 a unique value per document, which I doubt very much given your
 problem description.

  So with a corpus that size, I'd just try it.

 Best
 Erick

 On Sun, Jun 10, 2012 at 7:12 PM, roz dev rozde...@gmail.com wrote:
  Thanks Erik for your quick feedback
 
  When Products are assigned to a category or Sub-Category then they can be
  in any order and price type can be regular or markdown.
  So, reg and markdown products are intermingled  as per their assignment
 but
  I want to sort them in such a way that we
  ensure that all the products which are on markdown are at the bottom of
 the
  list.
 
  I can use these multiple sorts but I realize that they are costly in
 terms
  of heap used, as they are using FieldCache.
 
  I have an index with 2M docs and docs are pretty big. So, I don't want to
  use them unless there is no other option.
 
  I am wondering if I can define a custom function query which can be like
  this:
 
 
- check if product is on the markdown
- if yes then change its sort order field to be the max value in the
given sub-category, say 99
- else, use the sort order of the product in the sub-category
 
  I have been looking at existing function queries but do not have a good
  handle on how to make one of my own.
 
  - Another option could be use a custom sort comparator but I am not sure
  about the way it works
 
  Any thoughts?
 
 
  -Saroj
 
 
 
 
  On Sun, Jun 10, 2012 at 5:02 AM, Erick Erickson erickerick...@gmail.com
 wrote:
 
  Skimming this, I two options come to mind:
 
  1 Simply apply primary, secondary, etc sorts. Something like
sort=subcategory asc,markdown_or_regular desc,sort_order asc
 
  2 You could also use grouping to arrange things in groups and sort
 within
   those groups. This has the advantage of returning some members
   of each of the top N groups in the result set, which makes it
 easier
  to
   get some of each group rather than having to analyze the whole
  list
 
  But your example is somewhat contradictory. You say
  products which are on markdown, are at
  the bottom of the documents list
 
  But in your examples, products on markdown are intermingled
 
  Best
  Erick
 
  On Sun, Jun 10, 2012 at 3:36 AM, roz dev rozde...@gmail.com wrote:
   Hi All
  
  
   I have an index which contains a Catalog of Products and Categories,
  with
   Solr 4.0 from trunk
  
   Data is organized like this:
  
   Category: Books
  
   Sub Category: Programming
  
   Products:
  
   Product # 1,  Price: Regular Sort Order:1
   Product # 2,  Price: Markdown, Sort Order:2
   Product # 3   Price: Regular, Sort Order:3
   Product # 4   Price: Regular, Sort Order:4
   
   .
   ...
   Product # 100   Price: Regular, Sort Order:100
  
   Sub Category: Fiction
  
   Products:
  
   Product # 1,  Price: Markdown, Sort Order:1
   Product # 2,  Price: Regular, Sort Order:2
   Product # 3   Price: Regular, Sort Order:3
   Product # 4   Price: Markdown, Sort Order:4
   
   .
   ...
   Product # 70   Price: Regular, Sort Order:70
  
  
   I want to query Solr and sort these products within each of the
   sub-category in a such a way that products which are on markdown,
 are at
   the bottom of the documents list and other products
   which are on regular price, are sorted as per their sort order in
 their
   sub-category.
  
   Expected Results are
  
   Category: Books
  
   Sub Category: Programming
  
   Products:
  
   Product # 1,  Price: Regular Sort Order:1
   Product # 2,  Price: Markdown, Sort Order:101
   Product

Is there any performance cost of using lots of OR in the solr query

2012-04-04 Thread roz dev
Hi All,

I am working on an application which makes a few Solr calls to get the data.

On the high level, We have a requirement like this


   - Make the first call to Solr to get the list of products which are
   children of a given category
   - Make a 2nd Solr call to get product documents based on a list of product
   ids

The 2nd query will look like:

q=document_type:SKU&fq=product_id:(34 OR 45 OR 56 OR 77)

We can have close to 100 product ids in fq.

Is there a performance cost of doing these Solr calls which have lots of ORs?

As per slide #41 of the presentation The Seven Deadly Sins of Solr, it is a
bad idea to have these kinds of queries.

http://www.slideshare.net/lucenerevolution/hill-jay-7-sins-of-solrpdf

But it does not make clear the reason why it is bad.

Any inputs will be welcome.

Thanks

Saroj


Solr Cloud, Commits and Master/Slave configuration

2012-02-27 Thread roz dev
Hi All,

I am trying to understand features of Solr Cloud, regarding commits and
scaling.


   - If I am using Solr Cloud, then do I need to explicitly call commit
   (hard commit)? Or is a soft commit okay, and Solr Cloud will do the job of
   writing to disk?


   - Do we still need to use a Master/Slave setup to scale searching? If we
   have to use a Master/Slave setup, then do I need to issue a hard commit to
   make my changes visible to slaves?
   - If I were to use NRT with a Master/Slave setup with soft commits, then
   will the slaves be able to see changes made on the master with a soft
   commit?

Any inputs are welcome.
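For reference, a rough sketch of how hard and soft commits are usually
configured in solrconfig.xml (the interval values are arbitrary): the hard
commit flushes segments to disk without opening a new searcher, while the
soft commit makes recent changes visible for NRT search.

  <updateHandler class="solr.DirectUpdateHandler2">
    <autoCommit>
      <maxTime>60000</maxTime>          <!-- hard commit: durability, writes segments to disk -->
      <openSearcher>false</openSearcher>
    </autoCommit>
    <autoSoftCommit>
      <maxTime>1000</maxTime>           <!-- soft commit: visibility for NRT search -->
    </autoSoftCommit>
  </updateHandler>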

Thanks

-Saroj


Re: hot deploy of newer version of solr schema in production

2012-01-31 Thread roz dev
Thanks Jan for your inputs.

I am keen to know how people keep live sites running while there is a
breaking change which calls for complete re-indexing.
We want to build a new index, with a new schema (it may take a couple of
hours), without impacting the live e-commerce site.
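One possible pattern (just a sketch - host and core names are illustrative) is
to rebuild into a spare core and then swap it in with the CoreAdmin SWAP
command, so the live core keeps serving queries for the couple of hours the
rebuild takes:

  1. Index the full catalog into a spare core (say "rebuild") using the new schema.
  2. Once it is built and warmed, swap it with the live core:
     http://host:8983/solr/admin/cores?action=SWAP&core=live&other=rebuild
  3. The previous index stays available in the "rebuild" core in case a rollback is needed.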

Any thoughts are welcome.

Thanks
Saroj


On Tue, Jan 24, 2012 at 12:21 AM, Jan Høydahl jan@cominvent.com wrote:

 Hi,

 To be able to do a true hot deploy of a newer schema without reindexing, you
 must carefully see to it that none of your changes are breaking changes. So
 you should test the process on your development machine and make sure it
 works. Adding and deleting fields would work, but not changing the
 field-type or analysis of an existing field. Depending on from/to version,
 you may want to keep the old schema-version number.

 The process is:
 1. Deploy the new schema, including all dependencies such as dictionaries
 2. Do a RELOAD CORE http://wiki.apache.org/solr/CoreAdmin#RELOAD

 My preference is to do a more thorough upgrade of schema including new
 functionality and breaking changes, and then do a full reindex. The
 exception is if my index is huge and the reason for Solr upgrade or schema
 change is to fix a bug, not to use new functionality.

 --
 Jan Høydahl, search solution architect
 Cominvent AS - www.cominvent.com
 Solr Training - www.solrtraining.com

 On 24. jan. 2012, at 01:51, roz dev wrote:

  Hi All,
 
  I need community's feedback about deploying newer versions of solr schema
  into production while existing (older) schema is in use by applications.
 
  How do people perform these things? What has been the learning of people
  about this.
 
  Any thoughts are welcome.
 
  Thanks
  Saroj




hot deploy of newer version of solr schema in production

2012-01-23 Thread roz dev
Hi All,

I need the community's feedback about deploying newer versions of a Solr
schema into production while the existing (older) schema is in use by
applications.

How do people perform these things? What has been the learning of people
about this?

Any thoughts are welcome.

Thanks
Saroj


Index format difference between 4.0 and 3.4

2011-11-14 Thread roz dev
Hi All,

We are using Solr 1.4.1 in production and are considering an upgrade to a
newer version.

It seems that Solr 3.x requires a complete rebuild of the index, as the
format seems to have changed.

Is Solr 4.0 index file format compatible with Solr 3.x format?

Please advise.

Thanks
Saroj


Re: Production Issue: SolrJ client throwing this error even though field type is not defined in schema

2011-09-30 Thread roz dev
This issue disappeared when we reduced the number of documents which were
being returned from Solr.

Looks to be some issue with Tomcat or Solr, returning truncated responses.

-Saroj


On Sun, Sep 25, 2011 at 9:21 AM, pulkitsing...@gmail.com wrote:

 If I had to give a gentle nudge, I would ask you to validate your schema
 XML file. You can do so by looking for any w3c XML validator website and
 just copy pasting the text there to find out where its malformed.

 Sent from my iPhone

 On Sep 24, 2011, at 2:01 PM, Erick Erickson erickerick...@gmail.com
 wrote:

  You might want to review:
 
  http://wiki.apache.org/solr/UsingMailingLists
 
  There's really not much to go on here.
 
  Best
  Erick
 
  On Wed, Sep 21, 2011 at 12:13 PM, roz dev rozde...@gmail.com wrote:
  Hi All
 
  We are getting this error in our Production Solr Setup.
 
  Message: Element type t_sort must be followed by either attribute
  specifications,  or /.
  Solr version is 1.4.1
 
  Stack trace indicates that solr is returning malformed document.
 
 
  Caused by: org.apache.solr.client.solrj.SolrServerException: Error
  executing query
 at
 org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:95)
 at
 org.apache.solr.client.solrj.SolrServer.query(SolrServer.java:118)
 at
 com.gap.gid.search.impl.SearchServiceImpl.executeQuery(SearchServiceImpl.java:232)
 ... 15 more
  Caused by: org.apache.solr.common.SolrException: parsing error
 at
 org.apache.solr.client.solrj.impl.XMLResponseParser.processResponse(XMLResponseParser.java:140)
 at
 org.apache.solr.client.solrj.impl.XMLResponseParser.processResponse(XMLResponseParser.java:101)
 at
 org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:481)
 at
 org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:244)
 at
 org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:89)
 ... 17 more
  Caused by: javax.xml.stream.XMLStreamException: ParseError at
  [row,col]:[3,136974]
  Message: Element type t_sort must be followed by either attribute
  specifications,  or /.
 at
 com.sun.org.apache.xerces.internal.impl.XMLStreamReaderImpl.next(XMLStreamReaderImpl.java:594)
 at
 org.apache.solr.client.solrj.impl.XMLResponseParser.readArray(XMLResponseParser.java:282)
 at
 org.apache.solr.client.solrj.impl.XMLResponseParser.readDocument(XMLResponseParser.java:410)
 at
 org.apache.solr.client.solrj.impl.XMLResponseParser.readDocuments(XMLResponseParser.java:360)
 at
 org.apache.solr.client.solrj.impl.XMLResponseParser.readNamedList(XMLResponseParser.java:241)
 at
 org.apache.solr.client.solrj.impl.XMLResponseParser.processResponse(XMLResponseParser.java:125)
 ... 21 more
 



Re: Production Issue: SolrJ client throwing - Element type must be followed by either attribute specifications, or /.

2011-09-22 Thread roz dev
Wanted to update the list with our findings.

We reduced the number of documents being retrieved from Solr and this error
did not appear again. It might be that, due to the high number of documents,
Solr is returning incomplete documents.

-Saroj


On Wed, Sep 21, 2011 at 12:13 PM, roz dev rozde...@gmail.com wrote:

 Hi All

 We are getting this error in our Production Solr Setup.

 Message: Element type "t_sort" must be followed by either attribute
 specifications, ">" or "/>".
 Solr version is 1.4.1

 Stack trace indicates that solr is returning malformed document.


 Caused by: org.apache.solr.client.solrj.SolrServerException: Error executing 
 query
   at 
 org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:95)
   at org.apache.solr.client.solrj.SolrServer.query(SolrServer.java:118)
   at 
 com.gap.gid.search.impl.SearchServiceImpl.executeQuery(SearchServiceImpl.java:232)
   ... 15 more
 Caused by: org.apache.solr.common.SolrException: parsing error
   at 
 org.apache.solr.client.solrj.impl.XMLResponseParser.processResponse(XMLResponseParser.java:140)
   at 
 org.apache.solr.client.solrj.impl.XMLResponseParser.processResponse(XMLResponseParser.java:101)
   at 
 org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:481)
   at 
 org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:244)
   at 
 org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:89)
   ... 17 more
 Caused by: javax.xml.stream.XMLStreamException: ParseError at 
 [row,col]:[3,136974]
 Message: Element type "t_sort" must be followed by either attribute
 specifications, ">" or "/>".
   at 
 com.sun.org.apache.xerces.internal.impl.XMLStreamReaderImpl.next(XMLStreamReaderImpl.java:594)
   at 
 org.apache.solr.client.solrj.impl.XMLResponseParser.readArray(XMLResponseParser.java:282)
   at 
 org.apache.solr.client.solrj.impl.XMLResponseParser.readDocument(XMLResponseParser.java:410)
   at 
 org.apache.solr.client.solrj.impl.XMLResponseParser.readDocuments(XMLResponseParser.java:360)
   at 
 org.apache.solr.client.solrj.impl.XMLResponseParser.readNamedList(XMLResponseParser.java:241)
   at 
 org.apache.solr.client.solrj.impl.XMLResponseParser.processResponse(XMLResponseParser.java:125)
   ... 21 more




Production Issue: SolrJ client throwing this error even though field type is not defined in schema

2011-09-21 Thread roz dev
Hi All

We are getting this error in our Production Solr Setup.

Message: Element type "t_sort" must be followed by either attribute
specifications, ">" or "/>".
Solr version is 1.4.1

The stack trace indicates that Solr is returning a malformed document.


Caused by: org.apache.solr.client.solrj.SolrServerException: Error
executing query
at 
org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:95)
at org.apache.solr.client.solrj.SolrServer.query(SolrServer.java:118)
at 
com.gap.gid.search.impl.SearchServiceImpl.executeQuery(SearchServiceImpl.java:232)
... 15 more
Caused by: org.apache.solr.common.SolrException: parsing error
at 
org.apache.solr.client.solrj.impl.XMLResponseParser.processResponse(XMLResponseParser.java:140)
at 
org.apache.solr.client.solrj.impl.XMLResponseParser.processResponse(XMLResponseParser.java:101)
at 
org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:481)
at 
org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:244)
at 
org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:89)
... 17 more
Caused by: javax.xml.stream.XMLStreamException: ParseError at
[row,col]:[3,136974]
Message: Element type "t_sort" must be followed by either attribute
specifications, ">" or "/>".
at 
com.sun.org.apache.xerces.internal.impl.XMLStreamReaderImpl.next(XMLStreamReaderImpl.java:594)
at 
org.apache.solr.client.solrj.impl.XMLResponseParser.readArray(XMLResponseParser.java:282)
at 
org.apache.solr.client.solrj.impl.XMLResponseParser.readDocument(XMLResponseParser.java:410)
at 
org.apache.solr.client.solrj.impl.XMLResponseParser.readDocuments(XMLResponseParser.java:360)
at 
org.apache.solr.client.solrj.impl.XMLResponseParser.readNamedList(XMLResponseParser.java:241)
at 
org.apache.solr.client.solrj.impl.XMLResponseParser.processResponse(XMLResponseParser.java:125)
... 21 more


cache invalidation in slaves

2011-09-20 Thread roz dev
Hi All

Solr has different types of caches, such as filterCache, queryResultCache and
documentCache. I know that if a commit is done then a new searcher is opened
and new caches are built, and this makes sense.

What happens when commits are happening on the master and slaves are pulling
all the delta updates?

Do slaves trash their caches and rebuild them every time new delta index
updates are downloaded to the slave?


Thanks
Saroj


q and fq in solr 1.4.1

2011-09-20 Thread roz dev
Hi All

I am sure that the q vs fq question has been answered several times.

But I still have a question that I would like to know the answer to:

If we have a Solr query like this:

q=*&fq=field_1:XYZ&fq=field_2:ABC&sortBy=field_3+asc
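
For reference, this is roughly how the same request can be issued through SolrJ.
This is a simplified sketch assuming the SolrJ 1.4.x API; the URL is a
placeholder and the field names are just the ones from the example above:

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.SolrServer;
    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
    import org.apache.solr.client.solrj.response.QueryResponse;

    public class QueryFilterSketch {
      public static void main(String[] args) throws Exception {
        SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");
        SolrQuery query = new SolrQuery("*:*");              // match-all; same intent as q=* above
        query.addFilterQuery("field_1:XYZ");                 // fq, cached in the filterCache
        query.addFilterQuery("field_2:ABC");                 // second fq, cached separately
        query.addSortField("field_3", SolrQuery.ORDER.asc);  // sort by field_3 ascending
        query.setRows(20);                                   // only fetch 20 rows at a time
        QueryResponse rsp = server.query(query);
        System.out.println("numFound=" + rsp.getResults().getNumFound());
      }
    }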

How does SolrIndexSearcher fire this query in 1.4.1?

Will it run the query against the whole index first (because q=*) and then
filter the results against field_1 and field_2, or does the filtering happen
in parallel?

And, if we ask for only 20 rows at a time, will Solr do the following:
1) get all the docs (because q is set to *) and sort them by field_3
2) then filter the results by field_1 and field_2

Or will it apply sorting after doing the filter?

Please let me know how Solr 1.4.1 works.

Thanks
Saroj


what is the default value of omitNorms and termVectors in solr schema

2011-09-18 Thread roz dev
Hi

As per this document, http://wiki.apache.org/solr/FieldOptionsByUseCase,
omitNorms and termVectors have to be explicitly specified in some cases.

I am wondering what the default values of these settings are if the Solr
schema definition does not state them.

*Example:*

field name=ql_path type=string indexed=false stored=true/

In the above case, will Solr create norms and a term vector for this field?
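
In the meantime, one way to check what actually got written (rather than
guessing at the defaults) is to open the index directly with the Lucene API.
Here is a small sketch, assuming Lucene 2.9.x as shipped with Solr 1.4.1; the
index path and doc number are illustrative:

    import java.io.File;
    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.store.FSDirectory;

    public class FieldOptionsCheck {
      public static void main(String[] args) throws Exception {
        IndexReader reader = IndexReader.open(FSDirectory.open(new File("/path/to/solr/data/index")));
        // true if norms were written for the field
        System.out.println("hasNorms(ql_path) = " + reader.hasNorms("ql_path"));
        // null if no term vector was stored for this field in doc 0
        System.out.println("termVector(0, ql_path) = " + reader.getTermFreqVector(0, "ql_path"));
        reader.close();
      }
    }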

Any ideas?

Thanks
Saroj


Re: Does Solr flush to disk even before ramBufferSizeMB is hit?

2011-08-30 Thread roz dev
Thanks Shawn.

If Solr writes this info to disk as soon as possible (which is what I am
seeing), then the ramBufferSizeMB setting seems to be misleading.

Anyone else has any thoughts on this?

-Saroj


On Mon, Aug 29, 2011 at 6:14 AM, Shawn Heisey s...@elyograg.org wrote:

 On 8/28/2011 11:18 PM, roz dev wrote:

 I notice that even though InfoStream does not mention that data is being
 flushed to disk, new segment files were created on the server.
 Size of these files kept growing even though there was enough Heap
 available
 and 856MB Ram was not even used.


 With the caveat that I am not an expert and someone may correct me, I'll
 offer this:  It's been my experience that Solr will write the files that
 constitute stored fields as soon as they are available, because that
 information is always the same and nothing will change in those files based
 on the next chunk of data.

 Thanks,
 Shawn




Does Solr flush to disk even before ramBufferSizeMB is hit?

2011-08-28 Thread roz dev
Hi All,
I am trying to tune ramBufferSizeMB and merge factor for my setup.

So, I enabled the Lucene IndexWriter's infoStream log and started monitoring
the data folder where index files are created.
I started my test with the following:

Heap: 3GB
Solr 1.4.1,
Index Size = 20 GB,
ramBufferSizeMB=856
Merge Factor=25


I ran my test with 30 concurrent threads writing to Solr.
My jobs delete 6 (approx) records by issuing a deleteByQuery command and
then proceed to write data.

A commit is done at the end of the writing process.
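
For context, each writer thread does roughly the following. This is a much
simplified SolrJ sketch; the URL, field names and delete query are
illustrative, not the real ones:

    import org.apache.solr.client.solrj.SolrServer;
    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
    import org.apache.solr.common.SolrInputDocument;

    public class WriterJobSketch {
      public static void main(String[] args) throws Exception {
        SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");

        // delete the records that are about to be re-written
        server.deleteByQuery("batch_id:12345");

        // re-add the documents for this batch
        for (int i = 0; i < 1000; i++) {
          SolrInputDocument doc = new SolrInputDocument();
          doc.addField("id", "12345-" + i);
          doc.addField("batch_id", "12345");
          doc.addField("name", "product " + i);
          server.add(doc);
        }

        // single commit at the end of the writing process
        server.commit();
      }
    }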

The results are a bit surprising to me and I need some help understanding them.

I notice that even though InfoStream does not mention that data is being
flushed to disk, new segment files were created on the server.
The size of these files kept growing even though there was enough heap
available and the 856 MB RAM buffer was not even used.

Is it the case that Lucene is flushing to disk even before ramBufferSizeMB is
hit? If that is the case, then why is InfoStream not logging this?

As per InfoStream, it is flushing at the end, but files are created much
before that.

Here is what InfoStream is saying. Please note that it indicates a new segment
is being flushed at 12:58 AM, but files were created at 12:53 AM and kept
growing.

Aug 29, 2011 12:46:00 AM IW 0 [main]: setInfoStream:
dir=org.apache.lucene.store.NIOFSDirectory@/opt/gid/solr/ecom/data/index
autoCommit=false
mergePolicy=org.apache.lucene.index.LogByteSizeMergePolicy@4552a64d
mergeScheduler=org.apache.lucene.index.ConcurrentMergeScheduler@35242cc9
ramBufferSizeMB=856.0
maxBufferedDocs=-1 maxBuffereDeleteTerms=-1
maxFieldLength=1 index=_3l:C2151995

Aug 29, 2011 12:57:35 AM IW 0 [web-1]: now flush at close
Aug 29, 2011 12:57:35 AM IW 0 [web-1]: flush: now pause all indexing threads
Aug 29, 2011 12:57:35 AM IW 0 [web-1]:   flush: segment=_3m
docStoreSegment=_3m docStoreOffset=0 flushDocs=true flushDeletes=true
flushDocStores=true numDocs=60788 numBufDelTerms=60788
Aug 29, 2011 12:57:35 AM IW 0 [web-1]:   index before flush _3l:C2151995
Aug 29, 2011 12:57:35 AM IW 0 [web-1]: DW: flush postings as segment _3m
numDocs=60788
Aug 29, 2011 12:57:35 AM IW 0 [web-1]: DW: closeDocStore: 2 files to flush
to segment _3m numDocs=60788
Aug 29, 2011 12:57:40 AM IW 0 [web-1]: DW: DW.recycleIntBlocks count=9 total
now 9
Aug 29, 2011 12:57:40 AM IW 0 [web-1]: DW: DW.recycleByteBlocks
blockSize=32768 count=182 total now 182
Aug 29, 2011 12:57:40 AM IW 0 [web-1]: DW: DW.recycleCharBlocks count=49
total now 49
Aug 29, 2011 12:57:40 AM IW 0 [web-1]: DW: DW.recycleIntBlocks count=7 total
now 16
Aug 29, 2011 12:57:40 AM IW 0 [web-1]: DW: DW.recycleByteBlocks
blockSize=32768 count=145 total now 327
Aug 29, 2011 12:57:40 AM IW 0 [web-1]: DW: DW.recycleCharBlocks count=37
total now 86
Aug 29, 2011 12:57:40 AM IW 0 [web-1]: DW: DW.recycleIntBlocks count=9 total
now 25
Aug 29, 2011 12:57:40 AM IW 0 [web-1]: DW: DW.recycleByteBlocks
blockSize=32768 count=208 total now 535
Aug 29, 2011 12:57:40 AM IW 0 [web-1]: DW: DW.recycleCharBlocks count=52
total now 138
Aug 29, 2011 12:57:40 AM IW 0 [web-1]: DW: DW.recycleIntBlocks count=7 total
now 32
Aug 29, 2011 12:57:40 AM IW 0 [web-1]: DW: DW.recycleByteBlocks
blockSize=32768 count=136 total now 671
Aug 29, 2011 12:57:40 AM IW 0 [web-1]: DW: DW.recycleCharBlocks count=39
total now 177
Aug 29, 2011 12:57:40 AM IW 0 [web-1]: DW: DW.recycleIntBlocks count=3 total
now 35
Aug 29, 2011 12:57:40 AM IW 0 [web-1]: DW: DW.recycleByteBlocks
blockSize=32768 count=58 total now 729
Aug 29, 2011 12:57:40 AM IW 0 [web-1]: DW: DW.recycleCharBlocks count=16
total now 193
Aug 29, 2011 12:57:41 AM IW 0 [web-1]: DW:   oldRAMSize=50469888
newFlushedSize=161169038 docs/MB=395.491 new/old=319.337%
Aug 29, 2011 12:57:41 AM IFD [web-1]: now checkpoint segments_1x [2
segments ; isCommit = false]
Aug 29, 2011 12:57:41 AM IW 0 [web-1]: DW: apply 60788 buffered deleted
terms and 0 deleted docIDs and 1 deleted queries on 2 segments.
Aug 29, 2011 12:57:42 AM IFD [web-1]: now checkpoint segments_1x [2
segments ; isCommit = false]
Aug 29, 2011 12:57:42 AM IFD [web-1]: now checkpoint segments_1x [2
segments ; isCommit = false]
Aug 29, 2011 12:57:42 AM IW 0 [web-1]: LMP: findMerges: 2 segments
Aug 29, 2011 12:57:42 AM IW 0 [web-1]: LMP:   level 6.6799455 to 7.4299455:
1 segments
Aug 29, 2011 12:57:42 AM IW 0 [web-1]: LMP:   level 5.1209826 to 5.8709826:
1 segments
Aug 29, 2011 12:57:42 AM IW 0 [web-1]: CMS: now merge
Aug 29, 2011 12:57:42 AM IW 0 [web-1]: CMS:   index: _3l:C2151995 _3m:C60788
Aug 29, 2011 12:57:42 AM IW 0 [web-1]: CMS:   no more merges pending; now
return
Aug 29, 2011 12:57:42 AM IW 0 [web-1]: CMS: now merge
Aug 29, 2011 12:57:42 AM IW 0 [web-1]: CMS:   index: _3l:C2151995 _3m:C60788
Aug 29, 2011 12:57:42 AM IW 0 [web-1]: CMS:   no more merges pending; now
return
Aug 29, 2011 12:57:42 AM IW 0 [web-1]: now call final commit()
Aug 29, 2011 12:57:42 AM IW 0 [web-1]: startCommit(): start sizeInBytes=0
Aug 29, 2011 12:57:42 AM 

SolrJ Question about Bad Request Root cause error

2011-01-11 Thread roz dev
Hi All

We are using the SolrJ client (v1.4.1) to integrate with our Solr search
server.
We notice that whenever a SolrJ request does not match the Solr schema, we
get a Bad Request exception, which makes sense.

org.apache.solr.common.SolrException: Bad Request

But the SolrJ client does not provide any clue about the reason the request is bad.

Is there any way to get the root cause on client side?

Of course, the Solr server logs have enough info to know that the data is bad,
but it would be great to have the same info in the exception generated by
SolrJ.
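
For reference, this is roughly all we can do on the client side today. The
sketch below (assuming SolrJ 1.4.x; the URL and field name are illustrative)
just unwraps the exception chain and still only surfaces "Bad Request" plus
the HTTP status code:

    import org.apache.solr.client.solrj.SolrServer;
    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
    import org.apache.solr.common.SolrException;
    import org.apache.solr.common.SolrInputDocument;

    public class BadRequestSketch {
      public static void main(String[] args) throws Exception {
        SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");
        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("no_such_field", "value");   // does not match the schema -> HTTP 400
        try {
          server.add(doc);
        } catch (SolrException e) {
          // getMessage() is just the HTTP reason phrase ("Bad Request"); code() is 400
          System.err.println("Solr returned " + e.code() + ": " + e.getMessage());
          for (Throwable t = e.getCause(); t != null; t = t.getCause()) {
            System.err.println("caused by: " + t);  // usually nothing useful here either
          }
        }
      }
    }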

Any thoughts? Is there any plan to add this in future releases?

Thanks,
Saroj