Re: Errors when implementing VelocityResponseWriter
Well, you need to specify a path, relative or absolute, that points to the directory where the Velocity JAR file resides. I'm not sure, at this point, exactly what you're missing, but it should be fairly straightforward. Solr logs the libraries it loads at startup, so maybe that is helpful info. 1.4.1 - does it support <lib>? (I'm not sure off the top of my head)

Erik

On Feb 15, 2011, at 12:04, McGibbney, Lewis John wrote:

> Hi Erik, thank you for the reply. I have placed all Velocity jar files in my /lib directory. As explained below, I have added the relevant configuration to solrconfig.xml; I am just wondering if the config instructions in the wiki are missing something? Can anyone advise on this? As you mentioned, my terminal output suggests that the VelocityResponseWriter class is not present and therefore the Velocity jar is not present... however this is not the case. I have specified <lib dir="./lib" /> in solrconfig.xml - is this enough, or do I need to use an exact path? I have already tried specifying an exact path and it does not seem to work either.
>
> Thank you
> Lewis
>
> From: Erik Hatcher [erik.hatc...@gmail.com]
> Sent: 15 February 2011 06:48
> To: solr-user@lucene.apache.org
> Subject: Re: Errors when implementing VelocityResponseWriter
>
> Looks like you're missing the Velocity JAR. It needs to be in some Solr-visible lib directory. With 1.4.1 you'll need to put it in solr-home/lib. In later versions, you can use the <lib> elements in solrconfig.xml to point to other directories.
>
> Erik
>
> On Feb 14, 2011, at 10:41, McGibbney, Lewis John wrote:
>
>> Hello List, I am currently trying to implement the above in Solr 1.4.1. Having moved the velocity directory from $SOLR_DIST/contrib/velocity/src/main/solr/conf to my webapp /lib directory, then adding a <queryResponseWriter name="blah" class="blah"/> followed by the responseHandler specifics, I am shown the following terminal output. I also added <lib dir="./lib" /> in solrconfig.xml. Can anyone suggest what I have not included in the config that is still required?
Thanks
Lewis

SEVERE: org.apache.solr.common.SolrException: Error loading class 'org.apache.solr.response.VelocityResponseWriter'
        at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:375)
        at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:413)
        at org.apache.solr.core.SolrCore.createInitInstance(SolrCore.java:435)
        at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:1498)
        at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:1492)
        at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:1525)
        at org.apache.solr.core.SolrCore.initWriters(SolrCore.java:1408)
        at org.apache.solr.core.SolrCore.init(SolrCore.java:547)
        at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:137)
        at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:83)
        at org.apache.catalina.core.ApplicationFilterConfig.initFilter(ApplicationFilterConfig.java:273)
        at org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:254)
        at org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:372)
        at org.apache.catalina.core.ApplicationFilterConfig.init(ApplicationFilterConfig.java:98)
        at org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:4382)
        at org.apache.catalina.core.StandardContext$2.call(StandardContext.java:5040)
        at org.apache.catalina.core.StandardContext$2.call(StandardContext.java:5035)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.ClassNotFoundException: org.apache.solr.response.VelocityResponseWriter
        at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
        at java.net.FactoryURLClassLoader.loadClass(URLClassLoader.java:627)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:247)
        at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:359)
        ... 21 more
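For anyone hitting the same ClassNotFoundException: a minimal sketch of the two solrconfig.xml pieces involved, assuming the Velocity jars sit in a lib/ directory under the instance dir (the writer name "velocity" is illustrative; the class name is the one from the stack trace above):

```xml
<!-- Sketch only: load extra jars; dir is resolved relative to the instance dir -->
<lib dir="./lib" />

<!-- Register the response writer -->
<queryResponseWriter name="velocity"
                     class="org.apache.solr.response.VelocityResponseWriter"/>
```

Note that, per Erik's reply, on 1.4.1 the dependable route is dropping the jars into solr-home/lib rather than relying on <lib>.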
Re: Question regarding inner entity in dataimporthandler
Greg, a few things I noticed while reading your post:

1) You don't need a field assignment for fields where the name does not change; you can just skip that. <field column="creationDate" name="creationDate" /> - just to name one example.

2) TemplateTransformer (http://wiki.apache.org/solr/DataImportHandler#TemplateTransformer) has no name attribute, just column and template.

3) Again TemplateTransformer - I never tried it out, but it should return 'doc' in your case when ${document.documentId} has no value. It should not work like MySQL's CONCAT(), which returns null if at least one argument is null. But actually I see no reason for using RegexTransformer?!

4) Your sub-entity problem is more or less obvious ;) If ${document.categoryId} is empty (regardless of whether it's an empty string or just null), your query is invalid. What will work: wrap the variable in quotes (select field1, field2 from table where field3 = '$variable'), then it will work, with or without a value.

Hope that helps,
Stefan

On Tue, Feb 15, 2011 at 8:13 PM, Greg Georges greg.geor...@biztree.com wrote:

> OK, I think I found some information; supposedly TemplateTransformer will return an empty string if the value of a variable is null. Some people say to use the RegexTransformer instead - can anyone clarify this? Thanks
>
> -----Original Message-----
> From: Greg Georges [mailto:greg.geor...@biztree.com]
> Sent: 15 February 2011 13:38
> To: solr-user@lucene.apache.org
> Subject: Question regarding inner entity in dataimporthandler
>
> Hello all, I have searched the forums for the question I am about to ask but never found any concrete results. This is my case: I am defining the data config file with the document and entity tags. I define with success a basic entity mapped to my MySQL database, and I then add some inner entities. The problem I have is with the one-to-one relationship between my document entity and its documentcategory entity. In my document table, the documentcategory foreign key is optional.
Here is my mapping:

<entity name="document" transformer="TemplateTransformer"
        query="select DocumentID, DocumentID as documentId, CreationDate as creationDate, DocumentName as documentName, Description as description, DescriptionAbstract as descriptionAbstract, Downloads as downloads, Downloads30days as downloads30days, Downloads90days as downloads90days, PageViews as pageViews, PageViews30days as PageViews30days, PageViews90days as pageViews90days, Bookmarks as bookmarks, Bookmarks30days as bookmarks30days, Bookmarks90days as bookmarks90days, DocumentRating as documentRating, DocumentRating30days as documentRating30days, DocumentRating90days as documentRating90days, LicenseType as licenseType, BizTreeLibraryDoc as bizTreeLibraryDoc, DocFormat as docFormat, Price as price, CreatedByMemberID as memberId, DocumentCategoryID as categoryId, IsFreeDoc as isFreeDoc from document">
  <field column="id" name="id" template="doc${document.documentId}" />
  <field column="documentId" name="docId" />
  <field column="creationDate" name="creationDate" />
  <field column="documentName" name="documentName" />
  <field column="description" name="description" />
  <field column="descriptionAbstract" name="descriptionAbstract" />
  <field column="downloads" name="downloads" />
  <field column="downloads30days" name="downloads30days" />
  <field column="downloads90days" name="downloads90days" />
  <field column="pageViews" name="pageViews" />
  <field column="pageViews30days" name="pageViews30days" />
  <field column="pageViews90days" name="pageViews90days" />
  <field column="bookmarks" name="bookmarks" />
  <field column="bookmarks30days" name="bookmarks30days" />
  <field column="bookmarks90days" name="bookmarks90days" />
  <field column="documentRating" name="documentRating" />
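Stefan's fourth point applied to a sub-entity would look roughly like this; the documentcategory table and column names are guesses for illustration, not taken from Greg's actual schema:

```xml
<!-- Hypothetical sub-entity: quoting the variable keeps the SQL valid
     even when ${document.categoryId} is null or an empty string -->
<entity name="documentcategory"
        query="select CategoryName as categoryName from documentcategory
               where DocumentCategoryID = '${document.categoryId}'">
  <field column="categoryName" name="categoryName" />
</entity>
```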
Re: clustering with tomcat
On Debian you can edit /etc/default/tomcat6

> Hi, I am using Solr 1.4 with Apache Tomcat. To enable the clustering feature I followed the link http://wiki.apache.org/solr/ClusteringComponent. Please help me: how do I add -Dsolr.clustering.enabled=true to $CATALINA_OPTS? After that, which steps will be required?
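One way to pass the flag, sketched for both a packaged and a vanilla Tomcat; the file locations are assumptions and vary by distro:

```shell
# Option A (Debian/Ubuntu packaged Tomcat): append to JAVA_OPTS in /etc/default/tomcat6
#   JAVA_OPTS="$JAVA_OPTS -Dsolr.clustering.enabled=true"

# Option B (vanilla Tomcat): export before running bin/startup.sh
CATALINA_OPTS="$CATALINA_OPTS -Dsolr.clustering.enabled=true"
export CATALINA_OPTS
echo "$CATALINA_OPTS"
```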
Re: How to use XML parser in DIH for a database?
What about using http://wiki.apache.org/solr/DataImportHandler#XPathEntityProcessor ?

On Wed, Feb 16, 2011 at 10:08 AM, Bill Bell billnb...@gmail.com wrote:

> I am using DIH. I am trying to take a column in a SQL Server database that returns an XML string and use XPath to get data out of it. I noticed that XPath works with external files - how do I get it to work with a database? I need something like //insur[5][@name='Blue Cross']. Thanks.
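The usual combination for an XML column in a database is a JDBC root entity plus a FieldReaderDataSource feeding an XPathEntityProcessor sub-entity. A sketch, with invented table, column, and field names:

```xml
<dataSource name="db" driver="..." url="..." />
<dataSource name="xml" type="FieldReaderDataSource" />

<document>
  <entity name="rec" dataSource="db" query="select id, payload_xml from records">
    <!-- dataField reads the XML string from the parent entity's column -->
    <entity name="payload" dataSource="xml" dataField="rec.payload_xml"
            processor="XPathEntityProcessor" forEach="/record">
      <field column="insurer" xpath="/record/insur[@name='Blue Cross']" />
    </entity>
  </entity>
</document>
```

Note that XPathEntityProcessor supports only a subset of XPath, so a positional predicate like [5] may need checking against the wiki page.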
Re: clustering with tomcat
On Wednesday 16 February 2011 02:41 PM, Markus Jelsma wrote:
> On Debian you can edit /etc/default/tomcat6
>
>> Hi, I am using Solr 1.4 with Apache Tomcat. To enable the clustering feature I followed the link http://wiki.apache.org/solr/ClusteringComponent. Please help me: how do I add -Dsolr.clustering.enabled=true to $CATALINA_OPTS? After that, which steps will be required?

I didn't understand; can you please elaborate how to do this?
Re: clustering with tomcat
What distro are you using? On at least Debian systems you can put the -Dsolr.clustering.enabled=true environment variable in /etc/default/tomcat6. You can also, of course, remove all occurrences of ${solr.clustering.enabled} from your solrconfig.xml.

On Wednesday 16 February 2011 10:52:35 Isha Garg wrote:
> On Wednesday 16 February 2011 02:41 PM, Markus Jelsma wrote:
>> On Debian you can edit /etc/default/tomcat6
>>
>>> Hi, I am using Solr 1.4 with Apache Tomcat. To enable the clustering feature I followed the link http://wiki.apache.org/solr/ClusteringComponent. Please help me: how do I add -Dsolr.clustering.enabled=true to $CATALINA_OPTS? After that, which steps will be required?
>
> I didn't understand; can you please elaborate how to do this?

--
Markus Jelsma - CTO - Openindex
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350
Re: clustering with tomcat
On Wednesday 16 February 2011 03:32 PM, Markus Jelsma wrote:
> What distro are you using? On at least Debian systems you can put the -Dsolr.clustering.enabled=true environment variable in /etc/default/tomcat6. You can also, of course, remove all occurrences of ${solr.clustering.enabled} from your solrconfig.xml.
>
>> [...]

I have embedded Solr in apache-tomcat5.5 on Linux and I am getting this error:

HTTP Status 500 - Severe errors in solr configuration. Check your log files for more detailed information on what may be wrong. If you want solr to continue after configuration errors, change: <abortOnConfigurationError>false</abortOnConfigurationError> in null

java.lang.NoSuchMethodError: org.carrot2.util.pool.SoftUnboundedPool.init(Lorg/carrot2/util/pool/IInstantiationListener;Lorg/carrot2/util/pool/IActivationListener;Lorg/carrot2/util/pool/IPassivationListener;Lorg/carrot2/util/pool/IDisposalListener;)V
        at org.carrot2.core.CachingController.init(CachingController.java:189)
        at org.carrot2.core.CachingController.init(CachingController.java:115)
        at org.apache.solr.handler.clustering.carrot2.CarrotClusteringEngine.init(CarrotClusteringEngine.java:94)
        at org.apache.solr.handler.clustering.ClusteringComponent.inform(ClusteringComponent.java:123)
        at org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:486)
        at org.apache.solr.core.SolrCore.init(SolrCore.java:588)
        at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:137)
        at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:83)
        at org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:221)
        at org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:302)
        at org.apache.catalina.core.ApplicationFilterConfig.init(ApplicationFilterConfig.java:78)
        at org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:3666)
        at org.apache.catalina.core.StandardContext.start(StandardContext.java:4258)
        at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:760)
        at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:740)
        at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:544)
        at org.apache.catalina.startup.HostConfig.deployDirectory(HostConfig.java:980)
        at org.apache.catalina.startup.HostConfig.deployDirectories(HostConfig.java:943)
        at org.apache.catalina.startup.HostConfig.deployApps(HostConfig.java:500)
        at org.apache.catalina.startup.HostConfig.start(HostConfig.java:1203)
        at org.apache.catalina.startup.HostConfig.lifecycleEvent(HostConfig.java:319)
        at org.apache.catalina.util.LifecycleSupport.fireLifecycleEvent(LifecycleSupport.java:120)
        at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1022)
        at org.apache.catalina.core.StandardHost.start(StandardHost.java:736)
        at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1014)
        at org.apache.catalina.core.StandardEngine.start(StandardEngine.java:443)
        at org.apache.catalina.core.StandardService.start(StandardService.java:448)
        at org.apache.catalina.core.StandardServer.start(StandardServer.java:700)
        at org.apache.catalina.startup.Catalina.start(Catalina.java:552)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:295)
        at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:433)

Now can you tell me what to do? I am not familiar with distros or Debian systems.
Snappull failed
Hi,

There are a couple of Solr 1.4.1 slaves, all doing the same: pulling some snaps, handling some queries, nothing exciting. But can anyone explain a sudden nightly occurrence of this error?

2011-02-16 01:23:04,527 ERROR [solr.handler.ReplicationHandler] - [pool-238-thread-1] - : SnapPull failed
org.apache.solr.common.SolrException: Unable to download _gv.frq completely. Downloaded 209715200!=583644834
        at org.apache.solr.handler.SnapPuller$FileFetcher.cleanup(SnapPuller.java:1026)
        at org.apache.solr.handler.SnapPuller$FileFetcher.fetchFile(SnapPuller.java:906)
        at org.apache.solr.handler.SnapPuller.downloadIndexFiles(SnapPuller.java:541)
        at org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:294)
        at org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:264)
        at org.apache.solr.handler.SnapPuller$1.run(SnapPuller.java:159)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
        at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:181)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:205)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:619)

All I know is that it was unable to download, but the reason eludes me. Sometimes a machine rolls out many of these errors, increasing the index size because it can't handle the already downloaded data.

Cheers,
--
Markus Jelsma - CTO - Openindex
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350
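One detail worth noticing in the error: the downloaded byte count is a suspiciously round number, which hints at a fixed cutoff (a proxy limit, timeout, or quota) rather than a random network drop. A quick sanity check on the numbers:

```shell
# 209715200 bytes is exactly 200 MiB - an unlikely place for a random failure to stop
echo $((200 * 1024 * 1024))
# bytes of _gv.frq left undownloaded
echo $((583644834 - 209715200))
```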
Re: clustering with tomcat
I have no idea; it seems you haven't compiled Carrot2 or haven't included all the jars.

On Wednesday 16 February 2011 11:29:30 Isha Garg wrote:
> [...]
>
> Now can you tell me what to do? I am not familiar with distros or Debian systems.

--
Markus Jelsma - CTO - Openindex
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350
Re: Solr not Available with Ping when DocBuilder is running
My error is that Solr is not reachable with a ping (a ping over PHP HttpRequest)...

---
System: one server, 12 GB RAM, 2 Solr instances, 7 cores, 1 core with 31 million documents, other cores 100,000.
- Solr1 for search requests - commit every minute - 4GB Xmx
- Solr2 for update requests - delta every 2 minutes - 4GB Xmx

--
View this message in context: http://lucene.472066.n3.nabble.com/Solr-not-Available-with-Ping-when-DocBuilder-is-running-tp2500214p2508686.html
Sent from the Solr - User mailing list archive at Nabble.com.
strange search-behavior over dynamic field
Hello. I have the fields reason_1 and reason_2. These two fields are covered in my schema by one dynamicField:

<dynamicField name="reason_*" type="textgen" indexed="true" stored="false"/>

I copy this field into my default text search field:

<copyField source="reason_*" dest="text"/>

And into a new field reason:

<copyField source="reason_*" dest="reason"/>

If I have two documents with exactly the same value in the reason_1 field, Solr can only find ONE document, not both. Why? Is it a behavior of Solr or wrong usage on my part?
Re: SolrException x undefined field
Hi,

We do have a validation layer for other purposes, but this layer does not know about the fields, and I would not like to replicate this configuration. Is there any way to query the Solr core about its declared fields?

thanks,
[ ]'s Leonardo da S. Souza
°v° Linux user #375225
/(_)\ http://counter.li.org/
^ ^

On Wed, Feb 16, 2011 at 9:16 AM, Savvas-Andreas Moysidis savvas.andreas.moysi...@googlemail.com wrote:

> Hi, if you have an application layer and are not directly hitting Solr, then maybe this functionality could be implemented in the validation layer prior to making the Solr call?
>
> Cheers,
> - Savvas
>
> On 16 February 2011 10:23, Leonardo Souza leonardo...@gmail.com wrote:
>
>> Hi, we are using Solr 1.4 in a big project. Now it's time to make some improvements. We use the standard query parser and we would like to handle misspelled field names. The problem is that SolrException cannot flag the problem appropriately, because this exception is used for other problems during query processing. I found some clue in the SolrException.ErrorCode enumeration but it did not help.
>>
>> thanks in advance!
Re: strange search-behavior over dynamic field
What does the admin page show as the contents of your index for reason_1? I suspect you don't really have two documents with the same value. Perhaps you gave them both the same uniqueKey and one overwrote the other. Perhaps you didn't commit the second. Perhaps... But you haven't provided enough information to go on here. What is the query (don't forget debugQuery=on)? What is the input?

Best
Erick

On Wed, Feb 16, 2011 at 6:26 AM, stockii stock.jo...@googlemail.com wrote:

> [...]
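Following Erick's suggestion, the parsed query and per-document scoring details can be pulled directly over HTTP; the core path and field value below are placeholders:

```shell
# debugQuery=on adds parsedquery and explain output to the response
curl 'http://localhost:8983/solr/select?q=reason_1:somevalue&debugQuery=on&fl=id'
```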
Re: SolrException x undefined field
Hi Stefan,

LukeRequestHandler could be a good solution; there's a lot of useful info. Does this handler work with version 1.4.x?

thanks
[ ]'s Leonardo da S. Souza
°v° Linux user #375225
/(_)\ http://counter.li.org/
^ ^

On Wed, Feb 16, 2011 at 10:41 AM, Stefan Matheis matheis.ste...@googlemail.com wrote:

> Maybe the http://wiki.apache.org/solr/LukeRequestHandler ?
>
> On Wed, Feb 16, 2011 at 1:34 PM, Savvas-Andreas Moysidis savvas.andreas.moysi...@googlemail.com wrote:
>
>> There is probably a better and more robust way of doing this, but you could make a request to /solr/admin/file/?file=schema.xml and parse the returned XML? Does anyone else know of a better way to query Solr for its schema?
>>
>> [...]
Spatial Search
Hi,

I have a very typical problem. From one of my applications I get data in this format:

<add>
  <doc>
    <field name="address">Some Address</field>
    <field name="zipcode">1</field>
  </doc>
</add>

How can I implement a spatial search for this data? Any ideas are welcome.

Regards,
Nishant Anand

This message and any attachments are solely for the intended recipient and may contain Birlasoft confidential or privileged information. If you are not the intended recipient, any disclosure, copying, use, or distribution of the information included in this message and any attachments is prohibited. If you have received this communication in error, please notify us by reply e-mail (mailad...@birlasoft.com) immediately and permanently delete this message and any attachments. Thank you.
Re: SolrCloud - Example C not working
On Wed, Feb 16, 2011 at 3:57 AM, Thorsten Scherler scher...@gmail.com wrote:
> On Tue, 2011-02-15 at 09:59 -0500, Yonik Seeley wrote:
>> On Mon, Feb 14, 2011 at 8:08 AM, Thorsten Scherler thors...@apache.org wrote:
>>> Hi all, I followed http://wiki.apache.org/solr/SolrCloud and everything worked fine till I tried Example C.
>>
>> Verified. I just tried and it failed for me too.
>
> Hi Yonik, thanks for verifying. :) Should I open an issue and move the thread to the dev list?

Yeah, thanks!

-Yonik
http://lucidimagination.com
Re: SolrException x undefined field
Regarding the wiki page: available since 1.2, so yes, it should :)

On Wed, Feb 16, 2011 at 1:55 PM, Leonardo Souza leonardo...@gmail.com wrote:

> Hi Stefan, LukeRequestHandler could be a good solution; there's a lot of useful info. Does this handler work with version 1.4.x?
>
> [...]
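To tie the suggestions together: the LukeRequestHandler can be queried over HTTP for the declared schema; the host, port, and path below are the defaults and may differ per deployment:

```shell
# show=schema returns the declared fields, types, and their flags
curl 'http://localhost:8983/solr/admin/luke?show=schema'
```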
Re: Spatial Search
Nishant, correct me if I'm wrong, but spatial search normally requires geo-information (latitude and longitude) to work. So you would need to fetch this information before putting the documents into Solr. The Google Maps API, for example, offers http://code.google.com/intl/all/apis/maps/documentation/geocoding/#ReverseGeocoding

Regards
Stefan

On Wed, Feb 16, 2011 at 1:58 PM, nishant.an...@birlasoft.com wrote:

> [...]
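Once coordinates have been obtained from a geocoder, the index side could look roughly like this sketch. The field names are invented, and the {!geofilt} syntax assumes the Solr 3.1+ spatial support rather than the 1.4 line:

```xml
<!-- schema.xml sketch: a hypothetical lat,lon field -->
<fieldType name="location" class="solr.LatLonType" subFieldSuffix="_coordinate"/>
<field name="geo" type="location" indexed="true" stored="true"/>
```

A query filtering within 10 km of a point would then be something like q=*:*&fq={!geofilt sfield=geo pt=28.61,77.21 d=10}.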
Re: Triggering optimise based on time interval
Renaud, just because I'm interested: what are your concerns about using cron for that?

Stefan

On Wed, Feb 16, 2011 at 2:12 PM, Renaud Delbru renaud.del...@deri.org wrote:

> [...]
Re: strange search-behavior over dynamic field
The fieldType is textgen. - --- System One Server, 12 GB RAM, 2 Solr Instances, 7 Cores, 1 Core with 31 Million Documents, other Cores ~100,000 - Solr1 for Search-Requests - commit every Minute - 4GB Xmx - Solr2 for Update-Requests - delta every 2 Minutes - 4GB Xmx -- View this message in context: http://lucene.472066.n3.nabble.com/strange-search-behavior-over-dynamic-field-tp2508711p2509166.html Sent from the Solr - User mailing list archive at Nabble.com.
Triggering optimise based on time interval
Hi, We would like to trigger an optimise every x hours. From what I can see, there is nothing in Solr (3.1-SNAPSHOT) that enables us to do such a thing. We have a master-slave configuration. The masters are tuned for fast indexing (large merge factor). However, for the moment, the master index is replicated as-is to the slaves, and therefore it does not provide very fast query times. Our idea was: - to configure the replication so that it only happens after an optimise, and - to schedule a partial optimise every x hours in order to reduce the number of segments for faster querying. We do not want to rely on a cron job for executing the partial optimise every x hours; we would prefer to configure this directly within the Solr config. Our first idea was to create a SolrEventListener, triggered postCommit, that would be in charge of executing an optimise at a regular time interval. Is this a good approach? Or are there other solutions to achieve this? Thanks, -- Renaud Delbru
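The postCommit-listener idea above boils down to "optimise only if enough time has passed since the last one". A minimal sketch of that decision logic, in Python purely for illustration (the real listener would be Java inside Solr; the six-hour interval is an arbitrary example):

```python
import time

OPTIMIZE_INTERVAL = 6 * 3600  # "every x hours"; six hours is an arbitrary example

def should_optimize(last_optimize, now=None, interval=OPTIMIZE_INTERVAL):
    """True once at least `interval` seconds have passed since the last optimise.

    In the SolrEventListener approach this check would run (in Java) on each
    postCommit event; plain Python here, purely for illustration.
    """
    if now is None:
        now = time.time()
    return (now - last_optimize) >= interval

# The cron-free loop the listener replaces could look like:
#   if should_optimize(last_run):
#       POST <optimize/> to the update handler
#       last_run = time.time()
```

The point of hooking this into postCommit is that the check runs whenever the index changes, so no external scheduler is needed; between commits, nothing happens and nothing needs to.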
Re: Triggering optimise based on time interval
Mainly technical administration effort. We are trying to have a Solr packaging that - minimises the effort to deploy the system on a machine, - reduces errors when deploying, and - centralises the logic of the Solr system. Ideally, we would like to have a central place (e.g., solrconfig) where the logic of the system is configured. In that case, the system administrator does not have to bother with a long list of tasks and checkpoints every time we need to release a new version of the Solr system or extend our clusters. He should just have to take the new release, ship it to a machine, and start up Solr. -- Renaud Delbru On 16/02/11 13:15, Stefan Matheis wrote: Renaud, just because I'm interested: what are your concerns about using cron for that? Stefan
Re: Dismax problem
It looks like you are trying to use a function query on a multi-valued field? -Yonik http://lucidimagination.com On Tue, Feb 15, 2011 at 8:34 AM, Ezequiel Calderara ezech...@gmail.com wrote: Hi, I'm having a problem while trying to do a dismax search. For example, I have the standard query URL like this: It returns 1 result. But when I try to use the dismax query type I get the following error:
15/02/2011 10:27:07 org.apache.solr.common.SolrException log GRAVE: java.lang.ArrayIndexOutOfBoundsException: 28
at org.apache.lucene.search.FieldCacheImpl$StringIndexCache.createValue(FieldCacheImpl.java:721)
at org.apache.lucene.search.FieldCacheImpl$Cache.get(FieldCacheImpl.java:224)
at org.apache.lucene.search.FieldCacheImpl.getStringIndex(FieldCacheImpl.java:692)
at org.apache.solr.search.function.StringIndexDocValues.init(StringIndexDocValues.java:35)
at org.apache.solr.search.function.OrdFieldSource$1.init(OrdFieldSource.java:84)
at org.apache.solr.search.function.OrdFieldSource.getValues(OrdFieldSource.java:58)
at org.apache.solr.search.function.FunctionQuery$AllScorer.init(FunctionQuery.java:123)
at org.apache.solr.search.function.FunctionQuery$FunctionWeight.scorer(FunctionQuery.java:93)
at org.apache.lucene.search.BooleanQuery$BooleanWeight.scorer(BooleanQuery.java:297)
at org.apache.lucene.search.IndexSearcher.searchWithFilter(IndexSearcher.java:268)
at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:258)
at org.apache.lucene.search.Searcher.search(Searcher.java:171)
at org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:988)
at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:884)
at org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:341)
at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:182)
at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:203)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:242)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:243)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:201)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:163)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:108)
at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:556)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:401)
at org.apache.coyote.http11.Http11AprProcessor.process(Http11AprProcessor.java:281)
at org.apache.coyote.http11.Http11AprProtocol$Http11ConnectionHandler.process(Http11AprProtocol.java:579)
at org.apache.tomcat.util.net.AprEndpoint$SocketProcessor.run(AprEndpoint.java:1568)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
The Solr instance is running as a replication slave. This is the solrconfig.xml: http://pastebin.com/GSv2wBB4 This is the schema.xml: http://pastebin.com/5VpRT5Jj Any help? How can I find what is causing this exception? I thought that dismax didn't throw exceptions... -- __ Ezequiel. http://www.ironicnet.com
Re: Triggering optimise based on time interval
Hm okay, reasonable :) Never used it, but maybe a pointer in the right direction? http://wiki.apache.org/solr/DataImportHandler#Scheduling On Wed, Feb 16, 2011 at 2:27 PM, Renaud Delbru renaud.del...@deri.org wrote: Mainly technical administration effort. We are trying to have a Solr packaging that minimises the effort to deploy the system on a machine, reduces errors when deploying, and centralises the logic of the Solr system. Ideally, we would like to have a central place (e.g., solrconfig) where the logic of the system is configured.
Re: strange search-behavior over dynamic field
The documents don't have the same uniqueKey; only the reason is the same. I cannot show the exact search request because of our privacy policy... The query is like this: reason_1: firstname lastname, reason_2: 1234, 02.02.2011 -- in field reason: firstname lastname, 1234, 02.02.2011. The search request comes from a PHP application. On my test environment I cannot reproduce this case... =(( Okay... I don't know why, but after a delta-import it's all okay... -- View this message in context: http://lucene.472066.n3.nabble.com/strange-search-behavior-over-dynamic-field-tp2508711p2509610.html Sent from the Solr - User mailing list archive at Nabble.com.
CJKAnalyzer and Synonyms
Hi everyone, I am trying to get Synonyms working with CJKAnalyzer. Search works fine but synonyms do not work as expected. Here is my field definition in the schema file:

<fieldType name="cjk" class="solr.TextField">
  <analyzer class="org.apache.lucene.analysis.cjk.CJKAnalyzer">
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
  </analyzer>
</fieldType>

When testing on the analysis page, the synonym filter does not kick in at all. My question is: What am I doing wrong and what is the proper way of defining the field type? Thanks in advance for your help! Alex -- View this message in context: http://lucene.472066.n3.nabble.com/CJKAnalyzer-and-Synonyms-tp2510104p2510104.html Sent from the Solr - User mailing list archive at Nabble.com.
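For what it's worth: when an analyzer is declared with a class attribute, Solr uses that analyzer class as-is, so nested <filter> elements are silently ignored. A hedged sketch of the composed form instead, assuming the solr.CJKTokenizerFactory shipped with the Solr version in question (untested; verify against your schema documentation):

```xml
<!-- Sketch, not a confirmed fix: build the chain from factories so the
     synonym filter actually participates in analysis. -->
<fieldType name="cjk" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.CJKTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"/>
  </analyzer>
</fieldType>
```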
Re: Triggering optimise based on time interval
I think you can get far by just optimising how often you do commits (as seldom as possible), as well as the mergeFactor, to get a good balance between indexing and query efficiency. It may be that you're looking for fewer segments on average, not always one fully optimised segment. If you still feel you need more optimising, by far the easiest is to implement the logic in your client, which sends an explicit optimise whenever your logic dictates. One way to hide this inside the Solr config could be to change your MergePolicy in solrconfig.xml, or to implement your own (http://lucene.apache.org/java/3_0_0/api/all/org/apache/lucene/index/MergePolicy.html) if you cannot find a suitable one. -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com On 16. feb. 2011, at 14.12, Renaud Delbru wrote: Hi, We would like to trigger an optimise every x hours. From what I can see, there is nothing in Solr (3.1-SNAPSHOT) that enables us to do such a thing.
Re: Are there any restrictions on what kind or how many fields you can use in a Pivot Query? I get ClassCastException when I use some of my string fields, and don't when I use some other string fields
Hello Ravish, Erick, I'm facing the same issue with solr-trunk (as of r1071282). - Field configuration:

<fieldType name="normalized_string" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
    <filter class="solr.TrimFilterFactory"/>
  </analyzer>
</fieldType>

- Schema configuration:

<field name="f1" type="normalized_string" indexed="true" stored="true"/>
<field name="f2" type="normalized_string" indexed="true" stored="true"/>
<field name="f3" type="normalized_string" indexed="true" stored="true"/>

In my test index, I have documents with sparse values: some documents may or may not have a value for f1, f2 and/or f3. The number of indexed documents is around 25. I'm facing the issue at query time, depending on my query and the temperature of the index. Parameters having an effect on the reproducibility: - number of levels of the decision tree: the deeper the tree, the sooner the exception arises - facet.limit parameter: the higher the limit, the sooner the exception arises. Examples: all docs, facet-pivoting on all fields that matter, varying facet.limit:

q=*:* pivot=f1,f2,f3 facet.limit=1 : OK
q=*:* pivot=f1,f2,f3 facet.limit=2 : OK
...
q=*:* pivot=f1,f2,f3 facet.limit=8 : OK
q=*:* pivot=f1,f2,f3 facet.limit=9 : NOT OK
retry q=*:* pivot=f1,f2,f3 facet.limit=9 : NOT OK
retry q=*:* pivot=f1,f2,f3 facet.limit=9 : OK
q=*:* pivot=f1,f2,f3 facet.limit=10 : NOT OK
retry q=*:* pivot=f1,f2,f3 facet.limit=10 : NOT OK
retry q=*:* pivot=f1,f2,f3 facet.limit=10 : NOT OK
retry q=*:* pivot=f1,f2,f3 facet.limit=10 : NOT OK
retry q=*:* pivot=f1,f2,f3 facet.limit=10 : NOT OK
retry q=*:* pivot=f1,f2,f3 facet.limit=10 : OK
q=*:* pivot=f1,f2,f3 facet.limit=11 : NOT OK
...

It really looks like a cache issue. After some retries, I can finally obtain my results, and not an HTTP 500. Once I obtain my results, I can ask for more, if I wait a little. That's very odd.
So before I continue, here is my query configuration:

<query>
  <maxBooleanClauses>1024</maxBooleanClauses>
  <filterCache class="solr.FastLRUCache" size="512" initialSize="512" autowarmCount="0"/>
  <queryResultCache class="solr.LRUCache" size="1024" initialSize="512" autowarmCount="0"/>
  <documentCache class="solr.LRUCache" size="1024" initialSize="512" autowarmCount="0"/>
  <enableLazyFieldLoading>true</enableLazyFieldLoading>
  <queryResultWindowSize>20</queryResultWindowSize>
  <queryResultMaxDocsCached>200</queryResultMaxDocsCached>
  <listener event="newSearcher" class="solr.QuerySenderListener">
    <arr name="queries">
      <!--
      <lst><str name="q">solr</str><str name="start">0</str><str name="rows">10</str></lst>
      <lst><str name="q">rocks</str><str name="start">0</str><str name="rows">10</str></lst>
      <lst><str name="q">static newSearcher warming query from solrconfig.xml</str></lst>
      -->
    </arr>
  </listener>
  <listener event="firstSearcher" class="solr.QuerySenderListener">
    <arr name="queries">
      <lst><str name="q">solr rocks</str><str name="start">0</str><str name="rows">10</str></lst>
      <lst><str name="q">static firstSearcher warming query from solrconfig.xml</str></lst>
    </arr>
  </listener>
  <useColdSearcher>false</useColdSearcher>
  <maxWarmingSearchers>2</maxWarmingSearchers>
</query>

That's very much like the default configuration. I guess that the default cache configuration is not perfectly suitable for facet pivoting, so any hint on how to tweak it right is welcome. Kind regards, -- Tanguy On 02/15/2011 06:05 PM, Erick Erickson wrote: To get meaningful help, you have to post a minimum of: 1) the relevant schema definitions for the field that makes it blow up, including the <fieldType> and <field> tags; 2) the query you used, with some indication of the field that makes it blow up; 3) what version you're using; 4) any changes you've made to the standard configurations; 5) whether you've recently installed a new version. It might help if you reviewed: http://wiki.apache.org/solr/UsingMailingLists Best Erick On Tue, Feb 15, 2011 at 11:27 AM, Ravish Bhagdev ravish.bhag...@gmail.com wrote: Looks like it's a bug? Is it not?
Ravish On Tue, Feb 15, 2011 at 4:03 PM, Ravish Bhagdev ravish.bhag...@gmail.com wrote: When I include some of the fields in my search query:
SEVERE: java.lang.ClassCastException: [Ljava.lang.Object; cannot be cast to [Lorg.apache.solr.common.util.ConcurrentLRUCache$CacheEntry;
at org.apache.solr.common.util.ConcurrentLRUCache$PQueue.myInsertWithOverflow(ConcurrentLRUCache.java:377)
at org.apache.solr.common.util.ConcurrentLRUCache.markAndSweep(ConcurrentLRUCache.java:329)
at org.apache.solr.common.util.ConcurrentLRUCache.put(ConcurrentLRUCache.java:144)
at org.apache.solr.search.FastLRUCache.put(FastLRUCache.java:131)
at org.apache.solr.search.SolrIndexSearcher.getDocSet(SolrIndexSearcher.java:904)
at org.apache.solr.handler.component.PivotFacetHelper.doPivots(PivotFacetHelper.java:121)
at org.apache.solr.handler.component.PivotFacetHelper.doPivots(PivotFacetHelper.java:126)
at org.apache.solr.handler.component.PivotFacetHelper.process(PivotFacetHelper.java:85)
Re: Term Vector Query on Single Document
On Wednesday 16 February 2011 16:49:51 Tod wrote: I have a couple of semi-related questions regarding the use of the Term Vector Component: - Using curl, is there a way to query a specific document (maybe using Tika when required?) to get a distribution of the terms it contains? No Tika involved here. You can just query a document (q=id:whatever) and enable the TVComponent. Make sure you list your fields in the tv.fl parameter. Those fields, of course, need termVectors enabled. - When I set termVectors on a field, do I need to reindex? I'm thinking 'yes'. Yes. - How expensive is setting termVectors on a field? It takes up additional disk space and RAM. It can be a lot. Thanks - Tod -- Markus Jelsma - CTO - Openindex http://www.linkedin.com/in/markus17 050-8536620 / 06-50258350
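The curl-style request described above can be sketched by building the query URL explicitly. The host, core path and handler name "/tvrh" are assumptions about the setup; the handler must be registered in solrconfig.xml, and the fields listed in tv.fl need termVectors="true" in the schema:

```python
from urllib.parse import urlencode

# Hedged sketch: fetch one document by id and ask the TermVectorComponent
# for its term information. Host/handler are illustrative assumptions.

def term_vector_url(doc_id, fields, base="http://localhost:8983/solr/tvrh"):
    params = {
        "q": "id:%s" % doc_id,
        "tv": "true",               # enable the TermVectorComponent
        "tv.tf": "true",            # ask for term frequencies
        "tv.fl": ",".join(fields),  # fields to return term vectors for
    }
    return base + "?" + urlencode(params)

print(term_vector_url("whatever", ["content"]))
# then: curl '<that URL>' to see the per-term information for the document
```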
Re: How to use XML parser in DIH for a database?
It only works on FileDataSource, right? Bill Bell Sent from mobile On Feb 16, 2011, at 2:17 AM, Stefan Matheis matheis.ste...@googlemail.com wrote: What about using http://wiki.apache.org/solr/DataImportHandler#XPathEntityProcessor ? On Wed, Feb 16, 2011 at 10:08 AM, Bill Bell billnb...@gmail.com wrote: I am using DIH. I am trying to take a column in a SQL Server database that returns an XML string and use XPath to get data out of it. I noticed that XPath works with external files; how do I get it to work with a database? I need something like //insur[5][@name='Blue Cross'] Thanks.
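If I remember the DIH docs correctly, XPathEntityProcessor is not limited to FileDataSource: a FieldReaderDataSource can feed it the XML held in a database column via a dataField reference. A hedged sketch (table, column and field names here are hypothetical, and the attribute details should be verified against the DIH documentation for your version):

```xml
<dataConfig>
  <dataSource name="db" type="JdbcDataSource" driver="..." url="..."/>
  <dataSource name="xmlField" type="FieldReaderDataSource"/>
  <document>
    <entity name="row" dataSource="db" query="SELECT id, insur_xml FROM patients">
      <!-- parse the XML held in the insur_xml column of the outer entity -->
      <entity name="insur" dataSource="xmlField" processor="XPathEntityProcessor"
              dataField="row.insur_xml" forEach="/record">
        <field column="insur_name" xpath="/record/insur/@name"/>
      </entity>
    </entity>
  </document>
</dataConfig>
```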
Re: SolrCloud - Example C not working
2011/2/16 Yonik Seeley yo...@lucidimagination.com On Wed, Feb 16, 2011 at 3:57 AM, Thorsten Scherler scher...@gmail.com wrote: On Tue, 2011-02-15 at 09:59 -0500, Yonik Seeley wrote: On Mon, Feb 14, 2011 at 8:08 AM, Thorsten Scherler thors...@apache.org wrote: Hi all, I followed http://wiki.apache.org/solr/SolrCloud and everything worked fine till I tried Example C. Verified. I just tried and it failed for me too. Hi Yonik, thanks for verifying. :) Should I open an issue and move the thread to the dev list? Yeah, thanks! -Yonik http://lucidimagination.com Hi, For me, example C doesn't work either. I just tried it; examples A and B worked like a charm. Stijn Vanhoorelbeke
Re: SolrCloud - Example C not working
It looks like a log4j issue:
java.lang.NoClassDefFoundError: org/apache/log4j/jmx/HierarchyDynamicMBean
at org.apache.zookeeper.jmx.ManagedUtil.registerLog4jMBeans(ManagedUtil.java:51)
at org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:114)
at org.apache.solr.cloud.SolrZkServer$1.run(SolrZkServer.java:111)
-Yonik http://lucidimagination.com On Wed, Feb 16, 2011 at 11:35 AM, Stijn Vanhoorelbeke stijn.vanhoorelb...@gmail.com wrote: For me, example C doesn't work either. I just tried it; examples A and B worked like a charm. Stijn Vanhoorelbeke
RE: Errors when implementing VelocityResponseWriter
Managed to get this working. I changed my solrconfig for the one provided in the velocity dir, repackaged the war file and redeployed on Tomcat. Although this seems like a ridiculously obvious thing to do, I somehow overlooked the repackaging aspect; this was where the problem was. Thanks for the help Erik. From: Erik Hatcher [erik.hatc...@gmail.com] Sent: 16 February 2011 08:06 To: solr-user@lucene.apache.org Subject: Re: Errors when implementing VelocityResponseWriter Well, you need to specify a path, relative or absolute, that points to the directory where the Velocity JAR file resides. I'm not sure, at this point, exactly what you're missing. But it should be fairly straightforward.
Erik On Feb 14, 2011, at 10:41, McGibbney, Lewis John wrote: Hello List, I am currently trying to implement the above in Solr 1.4.1. Having moved the velocity directory from $SOLR_DIST/contrib/velocity/src/main/solr/conf to my webapp /lib directory, then adding queryResponseWriter name=blah and class=blah followed by the responseHandler specifics, I am shown the following terminal output. I also added <lib dir="./lib" /> in solrconfig. Can anyone suggest what I have not included in the config that is still required? Thanks Lewis
SEVERE: org.apache.solr.common.SolrException: Error loading class 'org.apache.solr.response.VelocityResponseWriter'
at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:375)
at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:413)
at org.apache.solr.core.SolrCore.createInitInstance(SolrCore.java:435)
at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:1498)
at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:1492)
at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:1525)
at org.apache.solr.core.SolrCore.initWriters(SolrCore.java:1408)
at org.apache.solr.core.SolrCore.init(SolrCore.java:547)
at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:137)
at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:83)
at org.apache.catalina.core.ApplicationFilterConfig.initFilter(ApplicationFilterConfig.java:273)
at org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:254)
at org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:372)
at org.apache.catalina.core.ApplicationFilterConfig.init(ApplicationFilterConfig.java:98)
at org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:4382)
at org.apache.catalina.core.StandardContext$2.call(StandardContext.java:5040)
at org.apache.catalina.core.StandardContext$2.call(StandardContext.java:5035)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.ClassNotFoundException: org.apache.solr.response.VelocityResponseWriter
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
at
Re: Deploying Solr CORES on OVH Cloud
Hi, Jetty on Ubuntu has been working well for us and a bunch of our customers. Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Lucene ecosystem search :: http://search-lucene.com/ - Original Message From: Rosa (Anuncios) rosaemailanunc...@gmail.com To: solr-user@lucene.apache.org Sent: Tue, February 15, 2011 6:08:39 AM Subject: Re: Deploying Solr CORES on OVH Cloud Thanks for your response, but it doesn't help me a whole lot! Jetty vs Tomcat? Ubuntu or Debian? What are the pros of each for Solr use? Le 14/02/2011 23:12, William Bell a écrit : The first two questions are almost like religion. I am not sure we want to start a debate. Core setup is fairly easy: add a solr.xml file and subdirectories, one per core, under the Solr home directory (see the example/ directory). Make sure you use the right URL for the admin console. On Mon, Feb 14, 2011 at 3:38 AM, Rosa (Anuncios) rosaemailanunc...@gmail.com wrote: Hi, I'm a bit new to Solr. I'm trying to set up a bunch of servers (just for Solr) on the OVH cloud (http://www.ovh.co.uk/cloud/) and create new cores as needed on each server. First question: what do you recommend, Ubuntu or Debian, in terms of performance? Second question: Jetty or Tomcat, again in terms of performance and security? Third question: I've followed the wiki but I can't get the cores working... impossible to create a core or access my cores. Does anyone have a working config to share? Thanks a lot for your help. Regards,
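The "solr.xml plus one subdirectory per core" setup mentioned above can be sketched as a minimal solr.xml (core names and paths are examples only; each instanceDir needs its own conf/ with solrconfig.xml and schema.xml, as in the example/ directory of the distribution):

```xml
<!-- solr.xml in the Solr home directory -->
<solr persistent="true">
  <cores adminPath="/admin/cores">
    <core name="core0" instanceDir="core0"/>
    <core name="core1" instanceDir="core1"/>
  </cores>
</solr>
```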
Re: slave out of sync
Hi Tri, You could look at the stats page for each slave and compare the number of docs in them. The one(s) that are off from the rest/majority are out of sync. Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Lucene ecosystem search :: http://search-lucene.com/ - Original Message From: Tri Nguyen tringuye...@yahoo.com To: solr-user@lucene.apache.org Sent: Mon, February 14, 2011 7:19:58 PM Subject: slave out of sync Hi, We're thinking of having a master-slave configuration where there are multiple slaves. Let's say during replication, one of the slaves does not replicate properly. How will we dectect that the 1 slave is out of sync? Tri
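The stats-page comparison Otis describes boils down to: collect numDocs from each slave and flag the ones that disagree with the majority. Fetching and parsing each slave's stats page is omitted here; the counts are passed in directly, purely for illustration:

```python
from collections import Counter

# Sketch of the suggestion above: given numDocs per slave, return the
# slaves whose count differs from the majority count.

def out_of_sync(doc_counts):
    """doc_counts: {slave_name: numDocs}. Return slaves off the majority count."""
    majority, _ = Counter(doc_counts.values()).most_common(1)[0]
    return sorted(name for name, n in doc_counts.items() if n != majority)

print(out_of_sync({"slave1": 1000, "slave2": 1000, "slave3": 997}))
# -> ['slave3']
```

In practice you would also want to compare index versions (the replication details page exposes those), since two slaves can hold the same document count but different generations.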
optimize and mergeFactor
In my own Solr 1.4, I am pretty sure that running an index optimize does give me significantly better performance. Perhaps because I use some largeish (not huge, maybe as large as 200k) stored fields. So I'm interested in always keeping my index optimized. Am I right that if I set mergeFactor to '1', essentially my index will always be optimized after every commit, and actually running 'optimize' will be redundant? What are the possible negative repercussions of setting mergeFactor to 1? Is this a really bad idea? If not 1, what about some other lower-than-usually-recommended value like 2 or 3? Anyone done this? I imagine it will slow down my commits, but if the alternative is running optimize a lot anyway, I wonder at what point I break even (if I optimize after every single commit, clearly I might as well just set the mergeFactor low, right? But if I optimize after every X documents or Y commits, I don't know what X/Y are at break-even). Jonathan
Re: optimize and mergeFactor
> In my own Solr 1.4, I am pretty sure that running an index optimize does give me significantly better performance. Perhaps because I use some largeish (not huge, maybe as large as 200k) stored fields.

200.000 stored fields? I assume that number includes your number of documents? Sounds crazy =)

> So I'm interested in always keeping my index optimized. Am I right that if I set mergeFactor to '1', essentially my index will always be optimized after every commit, and actually running 'optimize' will be redundant?

You can set mergeFactor to 2, not lower.

> What are the possible negative repercussions of setting mergeFactor to 1? Is this a really bad idea? If not 1, what about some other lower-than-usually-recommended value like 2 or 3? Anyone done this? I imagine it will slow down my commits, but if the alternative is running optimize a lot anyway, I wonder at what point I break even (if I optimize after every single commit, clearly I might as well just set the mergeFactor low, right? But if I optimize after every X documents or Y commits, I don't know what X/Y are at break-even).

This depends on the commit rate and whether there are a lot of updates and deletes instead of adds. Setting it very low will indeed cause a lot of merging and slow commits. It will also be very slow in replication, because merged files are copied over again and again, causing high I/O on your slaves. There is always a `break even`, but it depends (as usual) on your scenario and business demands.

> Jonathan
Solr multi cores or not
Hi, I have a need to index multiple applications using Solr, I also have the need to share indexes or run a search query across these application indexes. Is solr multi-core - the way to go? My server config is 2virtual CPUs @ 1.8 GHz and has about 32GB of memory. What is the recommendation? Thanks, Sai Thumuluri
Re: Shutdown hook executing for a long time
Closing a core will shut down almost everything related to the workings of a core: update and search handlers, possible warming searchers, etc. Check the implementation of the close method: http://svn.apache.org/viewvc/lucene/dev/branches/branch_3x/solr/src/java/org/apache/solr/core/SolrCore.java?view=markup

> 2011-02-16 11:32:45.489::INFO: Shutdown hook executing
> 2011-02-16 11:35:36.002::INFO: Shutdown hook complete
> The shutdown time seems to be proportional to the amount of time that Solr has been running. If I immediately restart and shut down again, it takes a fraction of a second. What causes it to take so long to shut down, and is there anything I can do to make it happen quicker?
Help with parsing configuration using SolrParams/NamedList
Hi, I'm trying to use a CustomSimilarityFactory and pass in per-field options from the schema.xml, like so:

<similarity class="org.ads.solr.CustomSimilarityFactory">
  <lst name="field_a">
    <int name="min">500</int>
    <int name="max">1</int>
    <float name="steepness">0.5</float>
  </lst>
  <lst name="field_b">
    <int name="min">500</int>
    <int name="max">2</int>
    <float name="steepness">0.5</float>
  </lst>
</similarity>

My problem is I am utterly failing to figure out how to parse this nested option structure within my CustomSimilarityFactory class. I know that the settings are available as a SolrParams object within the getSimilarity() method. I'm convinced I need to convert to a NamedList using params.toNamedList(), but my Java fu is too feeble to code the dang thing. The closest I seem to get is the top level as a NamedList, where the keys are field_a and field_b, but then my values are strings, e.g., {min=500,max=1,steepness=0.5}. Anyone who could dash off a quick example of how to do this? Thanks, --jay
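A NamedList is essentially an ordered list of (name, value) pairs whose values may themselves be NamedLists; the Java version of this would iterate params.toNamedList(), test each value with instanceof NamedList, and read the typed entries. Here is the shape of that traversal sketched in Python over a list-of-pairs stand-in (not Solr API, just the logic):

```python
# Stand-in for NamedList: an ordered list of (name, value) pairs where a
# value may itself be such a list (one per <lst name="field_x"> block).

def parse_field_options(named_list):
    """[(field, [(opt, value), ...]), ...] -> {field: {opt: value}}"""
    options = {}
    for name, value in named_list:
        if isinstance(value, list):  # a nested <lst name="field_x"> block
            options[name] = dict(value)
    return options

config = [
    ("field_a", [("min", 500), ("max", 1), ("steepness", 0.5)]),
    ("field_b", [("min", 500), ("max", 2), ("steepness", 0.5)]),
]
print(parse_field_options(config)["field_a"])
# -> {'min': 500, 'max': 1, 'steepness': 0.5}
```

If the values come back as strings like "{min=500,max=1,steepness=0.5}", the nested <lst> elements were probably flattened before reaching the factory, which suggests inspecting what params.toNamedList() actually returns rather than parsing those strings.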
Re: optimize and mergeFactor
Thanks for the answers, more questions below. On 2/16/2011 3:37 PM, Markus Jelsma wrote: 200.000 stored fields? I assume that number includes your number of documents? Sounds crazy =) Nope, I wasn't clear. I have less than a dozen stored fields, but the value of a stored field can sometimes be as large as 200kb. You can set mergeFactor to 2, not lower. Am I right though that manually running an 'optimize' is the equivalent of a mergeFactor=1? So there's no way to get Solr to keep the index in an 'always optimized' state, if I'm understanding correctly? Cool. Just want to understand what's going on. This depends on commit rate and whether there are a lot of updates and deletes instead of adds. Setting it very low will indeed cause a lot of merging and slow commits. It will also be very slow in replication because merged files are copied over again and again, causing high I/O on your slaves. There is always a `break even` point, but it depends (as usual) on your scenario and business demands. There are indeed sadly lots of updates and deletes, which is why I need to run optimize periodically. I am aware that this will cause more work for replication -- I think this is true whether I manually issue an optimize before replication _or_ whether I just keep the mergeFactor very low, right? Same issue either way. So... if I'm going to do lots of updates and deletes, and my other option is running an optimize before replication anyway, is there any reason it's going to be completely stupid to set the mergeFactor to 2 on the master? I realize it'll mean all index files are going to have to be replicated, but that would be the case if I ran a manual optimize in the same situation before replication too, I think. Jonathan
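For reference, the setting under discussion lives in solrconfig.xml; a minimal sketch, assuming the stock 1.4 config layout:

```xml
<mainIndex>
  <!-- 2 is the lowest legal value; even at 2, only an explicit
       optimize collapses the index down to a single segment -->
  <mergeFactor>2</mergeFactor>
</mainIndex>
```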
Re: Solr multi cores or not
Solr multi-core essentially just lets you run multiple separate, distinct Solr indexes in the same running Solr instance. It does NOT let you run queries across multiple cores at once. The cores are just like completely separate Solr indexes; they are just conveniently running in the same Solr instance. (Which can be easier and more compact to set up than actually setting up separate Solr instances. And they can share some config more easily. And it _may_ have implications on JVM usage, not sure). There is no good way in Solr to run a query across multiple Solr indexes; whether they are multi-core or single cores in separate Solr instances doesn't matter. Your first approach should be to try and put all the data in one Solr index (one Solr 'core'). Jonathan On 2/16/2011 3:45 PM, Thumuluri, Sai wrote: Hi, I have a need to index multiple applications using Solr; I also have the need to share indexes or run a search query across these application indexes. Is Solr multi-core the way to go? My server config is 2 virtual CPUs @ 1.8 GHz and has about 32GB of memory. What is the recommendation? Thanks, Sai Thumuluri
minimum Solr slave replication config
Solr 1.4.1. So, from the documentation at http://wiki.apache.org/solr/SolrReplication I was wondering if I could get away without having any actual configuration in my slave at all. The replication handler is turned on, but if I'm going to manually trigger replication pulls while supplying the master URL manually with the command too, by: command=fetchIndex&masterUrl=$solr_master Then I was thinking, gee, maybe I don't need any slave config at all. That _appears_ to not be true. In such a situation, when I tell the slave to fetchIndex&masterUrl=$solr_master, the command gives a 200 OK. But then when I go and check /replication?command=details on the slave, I'm actually presented with an exception: message null java.lang.NullPointerException at org.apache.solr.handler.ReplicationHandler.isPollingDisabled(ReplicationHandler.java:412) at So I'm thinking this is probably because you actually can't get away with no slave config at all. So: 1) Is this a bug? Maybe I did something I shouldn't have, but having command=details report a NullPointerException is probably not good, right? If someone who knows better agrees, I'll file it in JIRA? 2) Does anyone know what the minimal slave config is? If I plan to manually trigger replication pulls, and supply the masterUrl, maybe just an empty <lst name="slave"/>. Or are there other parameters I have to set even though I don't plan to use them? (I do not want automatic polling, only manually triggered pulls). Anyone have any advice, or should I just trial and error?
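An untested guess at a minimal slave section for solrconfig.xml for the manual-pull case (no pollInterval, so no automatic polling; the masterUrl is supplied on each request):

```xml
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="slave">
    <!-- intentionally empty: no pollInterval means no polling;
         masterUrl is passed with each fetchindex command instead -->
  </lst>
</requestHandler>
```

Pulls would then be triggered with something like /replication?command=fetchindex&masterUrl=http://master:8983/solr/replication -- whether the empty slave lst is enough to avoid the NullPointerException above would need to be verified.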
Re: optimize and mergeFactor
Thanks for the answers, more questions below. On 2/16/2011 3:37 PM, Markus Jelsma wrote: 200.000 stored fields? I assume that number includes your number of documents? Sounds crazy =) Nope, I wasn't clear. I have less than a dozen stored fields, but the value of a stored field can sometimes be as large as 200kb. You can set mergeFactor to 2, not lower. Am I right though that manually running an 'optimize' is the equivalent of a mergeFactor=1? So there's no way to get Solr to keep the index in an 'always optimized' state, if I'm understanding correctly? Cool. Just want to understand what's going on. That should be it. If I remember correctly a second segment is always written; new updates aren't merged immediately. This depends on commit rate and whether there are a lot of updates and deletes instead of adds. Setting it very low will indeed cause a lot of merging and slow commits. It will also be very slow in replication because merged files are copied over again and again, causing high I/O on your slaves. There is always a `break even` point, but it depends (as usual) on your scenario and business demands. There are indeed sadly lots of updates and deletes, which is why I need to run optimize periodically. I am aware that this will cause more work for replication -- I think this is true whether I manually issue an optimize before replication _or_ whether I just keep the mergeFactor very low, right? Same issue either way. Yes. But having several segments shouldn't make that much of a difference. If search latency is just a few additional milliseconds then I'd rather have a few more segments being copied over more quickly. So... if I'm going to do lots of updates and deletes, and my other option is running an optimize before replication anyway, is there any reason it's going to be completely stupid to set the mergeFactor to 2 on the master?
I realize it'll mean all index files are going to have to be replicated, but that would be the case if I ran a manual optimize in the same situation before replication too, I think. No, it's not stupid if you allow for slow indexing and slow copying of files but want a very quick search. Jonathan
RE: Solr multi cores or not
Hmmm. Maybe I'm not understanding what you're getting at, Jonathan, when you say 'There is no good way in Solr to run a query across multiple Solr indexes'. What about the 'shards' parameter? That allows searching across multiple cores in the same instance, or shards across multiple instances. There are certainly implications here (like Relevance not being consistent across cores / shards), but it works pretty well for us... Thanks! Bob Sandiford | Lead Software Engineer | SirsiDynix P: 800.288.8020 X6943 | bob.sandif...@sirsidynix.com www.sirsidynix.com -Original Message- From: Jonathan Rochkind [mailto:rochk...@jhu.edu] Sent: Wednesday, February 16, 2011 4:09 PM To: solr-user@lucene.apache.org Cc: Thumuluri, Sai Subject: Re: Solr multi cores or not Solr multi-core essentially just lets you run multiple separate, distinct Solr indexes in the same running Solr instance. It does NOT let you run queries across multiple cores at once. The cores are just like completely separate Solr indexes; they are just conveniently running in the same Solr instance. (Which can be easier and more compact to set up than actually setting up separate Solr instances. And they can share some config more easily. And it _may_ have implications on JVM usage, not sure). There is no good way in Solr to run a query across multiple Solr indexes; whether they are multi-core or single cores in separate Solr instances doesn't matter. Your first approach should be to try and put all the data in one Solr index (one Solr 'core'). Jonathan On 2/16/2011 3:45 PM, Thumuluri, Sai wrote: Hi, I have a need to index multiple applications using Solr; I also have the need to share indexes or run a search query across these application indexes. Is Solr multi-core the way to go? My server config is 2 virtual CPUs @ 1.8 GHz and has about 32GB of memory. What is the recommendation? Thanks, Sai Thumuluri
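For illustration, a shards request spanning two cores on the same instance might look like this (host and core names made up; note the shards values omit the http:// prefix):

```
http://localhost:8983/solr/core0/select?q=foo&shards=localhost:8983/solr/core0,localhost:8983/solr/core1
```

The request is sent to one core, which fans the query out to every core listed in shards and merges the results.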
last item in results page is always the same
(I'm using solr 1.4) I'm doing a test of my index, so I'm reading out every document in batches of 500. The query is (I added newlines here to make it readable): http://localhost:8983/solr/archive_ECCO/select/ ?q=archive%3AECCO fl=uri version=2.2 start=0 rows=500 indent=on sort=uri%20asc It turns out, in this case, the query should match every document. The response shows numFound=182413. If I scan the returned values, they appear sorted properly except the last one. In other words, the uri that are returned on the first page are: 100100 100200 etc... 0006601600 0006601700 1723200600 That 499th value is returned as the 499th value on every page. That is, if I call it with start=500, then most of the entries look right, but that last value will still be 1723200600, and the true 499th value is never returned. 1723200600 should have been returned as the 181,499th item. Is this a known solr bug or is there something subtle going on? Thanks, Paul
Re: last item in results page is always the same
On Wed, Feb 16, 2011 at 5:08 PM, Paul p...@nines.org wrote: Is this a known solr bug or is there something subtle going on? Yes, I think it's the following bug, fixed in 1.4.1: * SOLR-1777: fieldTypes with sortMissingLast=true or sortMissingFirst=true can result in incorrectly sorted results. -Yonik http://lucidimagination.com
Re: Solr multi cores or not
Yes, you're right, from now on when I say that, I'll say except shards. It is true. My understanding is that the shards functionality's intended use case is for when your index is so large that you want to split it up for performance. I think it works pretty well for that, with some limitations as you mention. From reading the list, my impression is that when people try to use shards to solve some _other_ problem, they generally run into problems. But maybe that's just because the people with the problems are the ones who appear on the list? My personal advice is still to try and put everything together in one big index; Solr will give you the least trouble with that, it's what Solr likes to do best. Move to shards certainly if your index is so large that moving to shards will give you the performance advantage you need, that's what they're for; be very cautious moving to shards for other challenges that 'one big index' is giving you that you're thinking shards will solve. Shards is, as I understand it, _not_ intended as a general purpose federation function; it's specifically intended to split an index across multiple hosts for performance. Jonathan On 2/16/2011 4:37 PM, Bob Sandiford wrote: Hmmm. Maybe I'm not understanding what you're getting at, Jonathan, when you say 'There is no good way in Solr to run a query across multiple Solr indexes'. What about the 'shards' parameter? That allows searching across multiple cores in the same instance, or shards across multiple instances. There are certainly implications here (like Relevance not being consistent across cores / shards), but it works pretty well for us... Thanks!
Bob Sandiford | Lead Software Engineer | SirsiDynix P: 800.288.8020 X6943 | bob.sandif...@sirsidynix.com www.sirsidynix.com -Original Message- From: Jonathan Rochkind [mailto:rochk...@jhu.edu] Sent: Wednesday, February 16, 2011 4:09 PM To: solr-user@lucene.apache.org Cc: Thumuluri, Sai Subject: Re: Solr multi cores or not Solr multi-core essentially just lets you run multiple separate, distinct Solr indexes in the same running Solr instance. It does NOT let you run queries across multiple cores at once. The cores are just like completely separate Solr indexes; they are just conveniently running in the same Solr instance. (Which can be easier and more compact to set up than actually setting up separate Solr instances. And they can share some config more easily. And it _may_ have implications on JVM usage, not sure). There is no good way in Solr to run a query across multiple Solr indexes; whether they are multi-core or single cores in separate Solr instances doesn't matter. Your first approach should be to try and put all the data in one Solr index (one Solr 'core'). Jonathan On 2/16/2011 3:45 PM, Thumuluri, Sai wrote: Hi, I have a need to index multiple applications using Solr; I also have the need to share indexes or run a search query across these application indexes. Is Solr multi-core the way to go? My server config is 2 virtual CPUs @ 1.8 GHz and has about 32GB of memory. What is the recommendation? Thanks, Sai Thumuluri
Re: Solr multi cores or not
Hi, That depends (as usual) on your scenario. Let me ask some questions: 1. what is the sum of documents for your applications? 2. what is the expected load in queries/minute? 3. what is the update frequency in documents/minute and how many documents per commit? 4. how many different applications do you have? 5. are the query demands for the business the same (or very similar) for all applications? 6. can you easily upgrade hardware or demand more machines? 7. must you enforce security between applications, and are the clients not under your control? I'm puzzled though, you have so much memory but so little CPU. What about the disks? Size? Spinning or SSD? Cheers, Hi, I have a need to index multiple applications using Solr; I also have the need to share indexes or run a search query across these application indexes. Is Solr multi-core the way to go? My server config is 2 virtual CPUs @ 1.8 GHz and has about 32GB of memory. What is the recommendation? Thanks, Sai Thumuluri
Re: Solr multi cores or not
You can also easily abuse shards to query multiple cores that share parts of the schema. This way you have isolation with the ability to query them all. The same can, of course, also be achieved using a single index with a simple field identifying the application and using fq on that one. Yes, you're right, from now on when I say that, I'll say except shards. It is true. My understanding is that the shards functionality's intended use case is for when your index is so large that you want to split it up for performance. I think it works pretty well for that, with some limitations as you mention. From reading the list, my impression is that when people try to use shards to solve some _other_ problem, they generally run into problems. But maybe that's just because the people with the problems are the ones who appear on the list? My personal advice is still to try and put everything together in one big index; Solr will give you the least trouble with that, it's what Solr likes to do best. Move to shards certainly if your index is so large that moving to shards will give you the performance advantage you need, that's what they're for; be very cautious moving to shards for other challenges that 'one big index' is giving you that you're thinking shards will solve. Shards is, as I understand it, _not_ intended as a general purpose federation function; it's specifically intended to split an index across multiple hosts for performance. Jonathan On 2/16/2011 4:37 PM, Bob Sandiford wrote: Hmmm. Maybe I'm not understanding what you're getting at, Jonathan, when you say 'There is no good way in Solr to run a query across multiple Solr indexes'. What about the 'shards' parameter? That allows searching across multiple cores in the same instance, or shards across multiple instances. There are certainly implications here (like Relevance not being consistent across cores / shards), but it works pretty well for us... Thanks!
Bob Sandiford | Lead Software Engineer | SirsiDynix P: 800.288.8020 X6943 | bob.sandif...@sirsidynix.com www.sirsidynix.com -Original Message- From: Jonathan Rochkind [mailto:rochk...@jhu.edu] Sent: Wednesday, February 16, 2011 4:09 PM To: solr-user@lucene.apache.org Cc: Thumuluri, Sai Subject: Re: Solr multi cores or not Solr multi-core essentially just lets you run multiple separate, distinct Solr indexes in the same running Solr instance. It does NOT let you run queries across multiple cores at once. The cores are just like completely separate Solr indexes; they are just conveniently running in the same Solr instance. (Which can be easier and more compact to set up than actually setting up separate Solr instances. And they can share some config more easily. And it _may_ have implications on JVM usage, not sure). There is no good way in Solr to run a query across multiple Solr indexes; whether they are multi-core or single cores in separate Solr instances doesn't matter. Your first approach should be to try and put all the data in one Solr index (one Solr 'core'). Jonathan On 2/16/2011 3:45 PM, Thumuluri, Sai wrote: Hi, I have a need to index multiple applications using Solr; I also have the need to share indexes or run a search query across these application indexes. Is Solr multi-core the way to go? My server config is 2 virtual CPUs @ 1.8 GHz and has about 32GB of memory. What is the recommendation? Thanks, Sai Thumuluri
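A sketch of the single-index alternative mentioned above (the application field name is made up): give every document a field identifying its application and filter on it:

```
/solr/select?q=user+query&fq=application:app1
```

Since fq filters are cached independently in the filterCache, repeating the same application filter across many queries is cheap.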
Re: solr.HTMLStripCharFilterFactory not working
I updated my data importer. I used to have:

<field column="webtitle" stripHTML="true" />
<field column="webdescription" stripHTML="true" />

which wasn't working. But I changed that to

<field column="webtitle" name="webtitle" stripHTML="true" />
<field column="webdescription" name="webdescription" stripHTML="true" />

and it is working fine. On Tue, Feb 15, 2011 at 5:50 PM, Koji Sekiguchi k...@r.email.ne.jp wrote: (11/02/16 8:03), Tanner Postert wrote: I am using the data import handler and using the HTMLStripTransformer doesn't seem to be working either. I've changed webtitle and webdescription to not be copied from title and description in the schema.xml file, then set them both to just be duplicates of title and description in the data importer query:

<document name="items">
  <entity dataSource="db" name="item" transformer="HTMLStripTransformer"
          query="select title as title, title as webtitle, description as description, description as webdescription FROM ...">
    <field column="webtitle" stripHTML="true" />
    <field column="webdescription" stripHTML="true" />
  </entity>
</document>

Just for input (I'm not sure that I could help you), I'm using HTMLStripTransformer with PlainTextEntityProcessor and it works fine for me:

<dataConfig>
  <dataSource name="f" type="URLDataSource" encoding="UTF-8" baseUrl="http://lucene.apache.org/"/>
  <document>
    <entity name="solr" processor="PlainTextEntityProcessor" transformer="HTMLStripTransformer" dataSource="f" url="solr/">
      <field column="plainText" name="text" stripHTML="true"/>
    </entity>
  </document>
</dataConfig>

Koji -- http://www.rondhuit.com/en/
Re: Solr multi cores or not
I frequently use multiple cores for these reasons: * Completely different applications, such as web search and directory search, or if their update latency / query / caching requirements are very different. I can then also nuke one without affecting the other. Also, you get nice separation for monitoring each app with e.g. New Relic RPM. * Two news collections in different languages, where I don't want the TF/IDF for overlapping terms between the languages to destroy relevancy. I then use sharding if we need to return results from both cores. In production we run 3-4 cores on the same server without problems. But be aware that you have enough memory for the extra caches and a few more Java objects. -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com On 17. feb. 2011, at 00.28, Markus Jelsma wrote: You can also easily abuse shards to query multiple cores that share parts of the schema. This way you have isolation with the ability to query them all. The same can, of course, also be achieved using a single index with a simple field identifying the application and using fq on that one. Yes, you're right, from now on when I say that, I'll say except shards. It is true. My understanding is that the shards functionality's intended use case is for when your index is so large that you want to split it up for performance. I think it works pretty well for that, with some limitations as you mention. From reading the list, my impression is that when people try to use shards to solve some _other_ problem, they generally run into problems. But maybe that's just because the people with the problems are the ones who appear on the list?
My personal advice is still to try and put everything together in one big index; Solr will give you the least trouble with that, it's what Solr likes to do best. Move to shards certainly if your index is so large that moving to shards will give you the performance advantage you need, that's what they're for; be very cautious moving to shards for other challenges that 'one big index' is giving you that you're thinking shards will solve. Shards is, as I understand it, _not_ intended as a general purpose federation function; it's specifically intended to split an index across multiple hosts for performance. Jonathan On 2/16/2011 4:37 PM, Bob Sandiford wrote: Hmmm. Maybe I'm not understanding what you're getting at, Jonathan, when you say 'There is no good way in Solr to run a query across multiple Solr indexes'. What about the 'shards' parameter? That allows searching across multiple cores in the same instance, or shards across multiple instances. There are certainly implications here (like Relevance not being consistent across cores / shards), but it works pretty well for us... Thanks! Bob Sandiford | Lead Software Engineer | SirsiDynix P: 800.288.8020 X6943 | bob.sandif...@sirsidynix.com www.sirsidynix.com -Original Message- From: Jonathan Rochkind [mailto:rochk...@jhu.edu] Sent: Wednesday, February 16, 2011 4:09 PM To: solr-user@lucene.apache.org Cc: Thumuluri, Sai Subject: Re: Solr multi cores or not Solr multi-core essentially just lets you run multiple separate, distinct Solr indexes in the same running Solr instance. It does NOT let you run queries across multiple cores at once. The cores are just like completely separate Solr indexes; they are just conveniently running in the same Solr instance. (Which can be easier and more compact to set up than actually setting up separate Solr instances. And they can share some config more easily. And it _may_ have implications on JVM usage, not sure).
There is no good way in Solr to run a query across multiple Solr indexes; whether they are multi-core or single cores in separate Solr instances doesn't matter. Your first approach should be to try and put all the data in one Solr index (one Solr 'core'). Jonathan On 2/16/2011 3:45 PM, Thumuluri, Sai wrote: Hi, I have a need to index multiple applications using Solr; I also have the need to share indexes or run a search query across these application indexes. Is Solr multi-core the way to go? My server config is 2 virtual CPUs @ 1.8 GHz and has about 32GB of memory. What is the recommendation? Thanks, Sai Thumuluri
Re: Migration from Solr 1.2 to Solr 1.4
: if you don't have any custom components, you can probably just use : your entire solr home dir as is -- just change the solr.war. (you can't : just copy the data dir though, you need to use the same configs) : : test it out, and note the Upgrading notes in the CHANGES.txt for the : 1.3, 1.4, and 1.4.1 releases for gotchas that you might want to watch : out for. : Thank you for your reply, I've tried to copy the data and configuration : directory without success : : SEVERE: Could not start SOLR. Check solr/home property : java.lang.RuntimeException: org.apache.lucene.index.CorruptIndexException: : Unknown format version: -10 Hmmm... ok, I'm not sure why that would happen. According to the CHANGES.txt, Solr 1.2 used Lucene 2.1 and Solr 1.4.1 used 2.9.3 -- so Solr 1.4 should have been able to read an index created by Solr 1.2. You *could* try upgrading first from 1.2 to 1.3, run an optimize command, and then try upgrading from 1.3 to 1.4 -- but I can't make any assertions that that will work better, since going straight from 1.2 to 1.4 should have worked the same way. When in doubt: reindex. -Hoss
Re: Searching for negative numbers very slow
: This was my first thought but -1 is relatively common but we have other : numbers just as common. I assume that when you say that you mean ...we have other numbers (that are not negative) just as common, (but searching for them is much faster) ? I don't have any insight into why your negative numbers are slower, but FWIW... : Interestingly enough : : fq=uid:-1 : fq=foo:bar : fq=alpha:omega : : is much (4x) slower than : : q=uid:-1 AND foo:bar AND alpha:omega ...this is (in and of itself) not that surprising for any three arbitrary disjoint queries. When a BooleanQuery is a full conjunction like this (all clauses required) it can efficiently skip scoring a lot of documents by looping over the clauses, asking each one for the next doc they match, and then leap frogging the other clauses to that doc. In the case of the three fq params, each query is executed in isolation, and *all* of the matches of each is accounted for. The speed of using distinct fq params in situations like this comes from the reuse after they are in the filterCache -- you can change fq=foo:bar to fq=foo:baz on the next query, and still reuse 2/3 of the work that was done on the first query. Likewise if the next query is fq=uid:-1&fq=foo:bar&fq=alpha:omegabeta then 2/3 of the work is already done again, and if a following query is fq=uid:-1&fq=foo:baz&fq=alpha:omegabeta then all of the work is already done and cached even though that particular request has never been seen by solr. -Hoss
Re: How to use XML parser in DIH for a database?
Does anyone have an example of using this with a SQL Server varchar or XML field?

<dataConfig>
  <dataSource />
  <document>
    <entity name="y" query="select * from y where xid=${x.id}">
      <entity name="x" processor="XPathEntityProcessor" forEach="/the/record/xpath" url="${y.xml_name}">
        <field column="full_name" xpath="/field/xpath" />
      </entity>
    </entity>
  </document>
</dataConfig>

On 2/16/11 2:17 AM, Stefan Matheis matheis.ste...@googlemail.com wrote: What about using http://wiki.apache.org/solr/DataImportHandler#XPathEntityProcessor ? On Wed, Feb 16, 2011 at 10:08 AM, Bill Bell billnb...@gmail.com wrote: I am using DIH. I am trying to take a column in a SQL Server database that returns an XML string and use XPath to get data out of it. I noticed that XPath works with external files; how do I get it to work with a database? I need something like //insur[5][@name='Blue Cross'] Thanks.
Re: score from two cores
A common problem in metasearch engines. It's not intractable. You just have to surface the right statistics into a 'fusion' scorer. - NOT always nice. When are we getting better releases? -- View this message in context: http://lucene.472066.n3.nabble.com/score-from-two-cores-tp2012444p2515617.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Http Connection is hanging while deleteByQuery
Thanks for updating your solution On Tue, Feb 8, 2011 at 8:20 AM, shan2812 shanmugaraja...@gmail.com wrote: Hi, At last the migration to Solr-1.4.1 does solve this issue :-).. Cheers -- View this message in context: http://lucene.472066.n3.nabble.com/Http-Connection-is-hanging-while-deleteByQuery-tp2367405p2451214.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: SolrIndexWriter was not closed prior to finalize(), indicates a bug -- POSSIBLE RESOURCE LEAK!!!
Thanks for the response Hoss. Sorry for replying late; I was on a business trip. The server was indexing as well as searching at the same time and it was configured for a Native file lock, could that be the issue? I got another server so moved it to a Master slave configuration with the file lock being single on both machines, and that solved the issue. I would however love to know what caused that error (it's never too late to learn, right???) Thanks, Ravi Kiran On Mon, Feb 7, 2011 at 2:51 PM, Chris Hostetter hossman_luc...@fucit.orgwrote: : While reloading a core I got this following error, when does this : occur ? Prior to this exception I do not see anything wrong in the logs. well, there are really two distinct types of errors in your log... : [#|2011-02-01T13:02:36.697-0500|SEVERE|sun-appserver2.1|org.apache.solr.servlet.SolrDispatchFilter|_ThreadID=25;_ThreadName=httpWorkerThread-9001-5;_RequestID=450f6337-1f5c-42bc-a572-f0924de36b56;|org.apache.lucene.store.LockObtainFailedException: : Lock obtain timed out: NativeFSLock@ : /data/solr/core/solr-data/index/lucene-7dc773a074342fa21d7d5ba09fc80678-write.lock : at org.apache.lucene.store.Lock.obtain(Lock.java:85) : at org.apache.lucene.index.IndexWriter.init(IndexWriter.java:1565) : at org.apache.lucene.index.IndexWriter.init(IndexWriter.java:1421) ...this is error #1, indicating that for some reason the IndexWriter Solr was trying to create wasn't able to get a Native Filesystem lock on your index directory -- is it possible you have two instances of Solr (or two solr cores) trying to re-use the same data directory? (diagnosing exactly why you got this error also requires knowing what Filesystem you are using).
: [#|2011-02-01T13:02:40.330-0500|SEVERE|sun-appserver2.1|org.apache.solr.update.SolrIndexWriter|_ThreadID=82;_ThreadName=Finalizer;_RequestID=121fac59-7b08-46b9-acaa-5c5462418dc7;|SolrIndexWriter : was not closed prior to finalize(), indicates a bug -- POSSIBLE RESOURCE : LEAK!!!|#] : : [#|2011-02-01T13:02:40.330-0500|SEVERE|sun-appserver2.1|org.apache.solr.update.SolrIndexWriter|_ThreadID=82;_ThreadName=Finalizer;_RequestID=121fac59-7b08-46b9-acaa-5c5462418dc7;|SolrIndexWriter : was not closed prior to finalize(), indicates a bug -- POSSIBLE RESOURCE : LEAK!!!|#] ...these errors are warning you that something very unexpected was discovered when the Garbage Collector tried to clean up the SolrIndexWriter -- it found that the SolrIndexWriter had never been formally closed. In normal operation, this might indicate the existence of a bug in code not managing its resources properly -- and in fact, it does indicate the existence of a bug in that evidently a Lock timed out failure doesn't cause the SolrIndexWriter to be closed -- but in your case it's not really something to be worried about -- it's just a cascading effect of the first error. -Hoss
Validate Query Syntax of Solr Request Before Sending
Hi, I wonder if it is possible to let the user build up a Solr query and have it validated by some Java API before sending it to Solr. Is there a parser that could help with that? I would like to help the user build a valid query as she types by showing messages like The query is not valid, or perhaps even more advanced: The parentheses are not balanced. Maybe one day it would also be possible to analyse the semantics of the query, like: This query has a built-in inconsistency because the two dates you have specified require documents to be both before AND after these dates. But this is far future... Regards, Christian Sonne Jensen -- View this message in context: http://lucene.472066.n3.nabble.com/Validate-Query-Syntax-of-Solr-Request-Before-Sending-tp2515797p2515797.html Sent from the Solr - User mailing list archive at Nabble.com.
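A full syntax check would mean running the string through Lucene's QueryParser and catching ParseException (which requires the Lucene jar). As a self-contained sketch of just the simpler parenthesis-balance hint mentioned above (class and method names made up):

```java
public class QueryChecker {
    // Returns true when every '(' has a matching ')' in order,
    // ignoring parentheses inside "quoted phrases".
    public static boolean parenthesesBalanced(String query) {
        int depth = 0;
        boolean inPhrase = false;
        for (char c : query.toCharArray()) {
            if (c == '"') inPhrase = !inPhrase;
            if (inPhrase) continue;              // skip chars inside a phrase
            if (c == '(') depth++;
            if (c == ')' && --depth < 0) return false; // ')' before any '('
        }
        return depth == 0;
    }
}
```

The same shape of check could drive the as-you-type message the post describes: run it on every keystroke and show "parentheses are not balanced" when it returns false.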
Re: How to use XML parser in DIH for a database?
Use a FieldReaderDataSource for reading a field from the database and then use XPathEntityProcessor. The FieldReaderDataSource will give you the stream that the XPath entity processor needs. Below is an example DIH configuration:

<?xml version="1.0"?>
<dataConfig>
  <dataSource type="JdbcDataSource" driver="oracle.jdbc.driver.OracleDriver" url="jdbc:oracle:thin:@localhost:1521:xe" user="user" password="password" name="ds"/>
  <dataSource name="fieldSource" type="FieldReaderDataSource" />
  <document>
    <entity name="clobxml" dataSource="ds" query="select * from tableXX" transformer="ClobTransformer">
      <field column="ID" name="id" />
      <field column="SUPPLIER_APPROVALS" name="supplier_approvals" clob="true"/>
      <entity name="xmlread" dataSource="fieldSource" processor="XPathEntityProcessor" forEach="/suppliers/supplier" dataField="clobxml.SUPPLIER_APPROVALS" onError="continue">
        <field column="supplier_name" xpath="/suppliers/supplier/name" />
        <field column="supplier_id" xpath="/suppliers/supplier/id" />
      </entity>
    </entity>
  </document>
</dataConfig>

- Thanx: Grijesh http://lucidimagination.com -- View this message in context: http://lucene.472066.n3.nabble.com/How-to-use-XML-parser-in-DIH-for-a-database-tp2508015p2515910.html Sent from the Solr - User mailing list archive at Nabble.com.
Is facet could be used for Analytics
Hello all, We need to build an analytics kind of application. Initially we planned to aggregate the results and add them to a database, or use an ETL tool. I have the idea of using facet search instead, and I just want to know others' opinions on this. We require results in the fashion below: the top 3 results in each column.

Top users     Country        Page accessed
UserA (100)   India (1000)   /Articles/abc (200)
UserB (100)   US (500)       /Articles/xyz (200)
UserC (100)   Russia (200)   /Articles/aaa (100)

When clicking on a particular user, the results should be grouped for that user:

Top users     Country        Page accessed
UserA (100)   India (100)    /Articles/abc (55)
              US (50)        /Articles/xyz (25)
                             /Articles/aaa (10)

This is just an example. I think facet search will help to solve this kind of issue. Regards, Ganesh
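For the facet approach, one request with a facet.field per column and facet.limit=3 returns the top-3 counts for every column at once, and the per-user drill-down is the same request plus an fq on that user. A minimal sketch of assembling such a query string, assuming hypothetical field names user, country, and page:

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

public class FacetQueryBuilder {

    /** Builds the query-string portion of a faceted request; field names are hypothetical. */
    public static String buildParams(String q, String... facetFields) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", q);
        params.put("rows", "0");            // counts only, no documents needed
        params.put("facet", "true");
        params.put("facet.limit", "3");     // the "top 3 per column" requirement
        StringBuilder sb = new StringBuilder();
        params.forEach((k, v) -> sb.append(enc(k)).append('=').append(enc(v)).append('&'));
        for (String f : facetFields) sb.append("facet.field=").append(enc(f)).append('&');
        sb.setLength(sb.length() - 1);      // drop the trailing '&'
        return sb.toString();
    }

    private static String enc(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Drilling down on one user would add fq=user:UserA to the same request.
        System.out.println(buildParams("*:*", "user", "country", "page"));
    }
}
```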
Re: Is facet could be used for Analytics
I think facet search is good for your requirement. Also, what about the Result Grouping feature of Solr? - Thanx: Grijesh http://lucidimagination.com -- View this message in context: http://lucene.472066.n3.nabble.com/Is-facet-could-be-used-for-Analytics-tp2515938p2515959.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Searching for negative numbers very slow
Is it my imagination, or has this exact email been on the list already?

Dennis Gearon Signature Warning: It is always a good idea to learn from your own mistakes. It is usually a better idea to learn from others' mistakes, so you do not have to make them yourself. from 'http://blogs.techrepublic.com.com/security/?p=4501&tag=nl.e036' EARTH has a Right To Life, otherwise we all die.

From: Chris Hostetter hossman_luc...@fucit.org To: solr-user@lucene.apache.org Cc: yo...@lucidimagination.com Sent: Wed, February 16, 2011 6:20:28 PM Subject: Re: Searching for negative numbers very slow

: This was my first thought, but -1 is relatively common and we have other
: numbers just as common.

I assume that when you say that, you mean "...we have other numbers (that are not negative) just as common (but searching for them is much faster)"? I don't have any insight into why your negative numbers are slower, but FWIW...

: Interestingly enough
:
: fq=uid:-1
: fq=foo:bar
: fq=alpha:omega
:
: is much (4x) slower than
:
: q=uid:-1 AND foo:bar AND alpha:omega

...this is (in and of itself) not that surprising for any three arbitrary disjoint queries. When a BooleanQuery is a full conjunction like this (all clauses required), it can efficiently skip scoring a lot of documents by looping over the clauses, asking each one for the next doc it matches, and then leap-frogging the other clauses to that doc. In the case of the three fq params, each query is executed in isolation, and *all* of the matches of each are accounted for. The speed of using distinct fq params in situations like this comes from the reuse after they are in the filterCache -- you can change fq=foo:bar to fq=foo:baz on the next query, and still reuse 2/3 of the work that was done on the first query.

Likewise, if the next query is fq=uid:-1&fq=foo:bar&fq=alpha:omegabeta then 2/3 of the work is already done again, and if a following query is fq=uid:-1&fq=foo:baz&fq=alpha:omegabeta then all of the work is already done and cached, even though that particular request has never been seen by Solr. -Hoss
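The leap-frogging of required clauses described here can be sketched over sorted doc-ID lists: take the first clause's current doc as a candidate and skip every other clause forward to it; any clause that overshoots supplies a new, larger candidate. This is only an illustration of the skipping idea, not Lucene's scorer code, and the postings arrays are invented:

```java
import java.util.ArrayList;
import java.util.List;

public class LeapFrogIntersect {

    /** Intersects sorted doc-ID lists by leap-frogging lagging lists past the candidate. */
    public static List<Integer> intersect(int[]... postings) {
        List<Integer> hits = new ArrayList<>();
        int[] pos = new int[postings.length];       // one cursor per required clause
        outer:
        while (pos[0] < postings[0].length) {
            int candidate = postings[0][pos[0]];    // clause 0 proposes a doc
            for (int i = 1; i < postings.length; i++) {
                // leap-frog this clause forward to the first doc >= candidate
                while (pos[i] < postings[i].length && postings[i][pos[i]] < candidate) pos[i]++;
                if (pos[i] >= postings[i].length) break outer;  // clause exhausted: done
                if (postings[i][pos[i]] > candidate) {
                    // overshoot: the larger doc becomes the new candidate, so
                    // skip clause 0 forward to it and start the round over
                    int target = postings[i][pos[i]];
                    while (pos[0] < postings[0].length && postings[0][pos[0]] < target) pos[0]++;
                    continue outer;
                }
            }
            hits.add(candidate);                    // every clause matched this doc
            pos[0]++;
        }
        return hits;
    }

    public static void main(String[] args) {
        int[] uid   = {2, 5, 9, 14};
        int[] foo   = {1, 5, 9, 20};
        int[] alpha = {5, 9, 14};
        System.out.println(intersect(uid, foo, alpha));   // [5, 9]
    }
}
```

Most docs in the three lists are never compared individually, which is why the single conjunctive q can beat three uncached fq params that each enumerate all of their matches.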