Re: Custom Cache cleared after a commit?
I guess I'll have to use something other than SolrCache to get what I want, then. Or I could use SolrCache and just change the code (I've already done so much of that anyway...). Anyway, thanks for the reply.
Re: mergeFactor and maxMergeDocs affecting num of segments created
Shawn, when I reindex data using full-import I get:

_0.fdt  3310
_0.fdx  23
_0.frq  857
_0.nrm  31
_0.prx  1748
_0.tis  350
_1.fdt  3310
_1.fdx  23
_1.fnm  1
_1.frq  857
_1.nrm  31
_1.prx  1748
_1.tii  5
_1.tis  350
segments.gen  1
segments_3  1

where all the _1 files are marked as archived (A). And when I run full-import again (for testing) I get _1 and _2 files, where all the _2 files are marked as archived. What does that mean? And the part I don't understand: full-import deletes the old index and creates a new one, so why am I seeing the old files again? - Thanks Regards Romi
Payload doesn't apply to WordDelimiterFilterFactory-generated tokens
Hi, I have a problem with the WordDelimiterFilterFactory and the DelimitedPayloadTokenFilterFactory. It seems that the payloads are applied only to the original word that I index; the WordDelimiterFilter doesn't apply them to the tokens it generates. For example, if I index the string JavaProject|1.7, at the end of my analyzer pipeline it is transformed like this:

JavaProject|1.7 -> javaproject|1.7 java project

Instead, what I would like is a result like this:

JavaProject|1.7 -> javaproject|1.7 java|1.7 project|1.7

This way the payload would be applied to the document even in case of partial matches on the original word. Here I have used the pipe notation, but imagine those payloads already stored internally in Solr. How can I do this? If it is needed, my analyzer looks like this:

<fieldType name="text_C" class="solr.TextField" positionIncrementGap="100" stored="false" indexed="true">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.DelimitedPayloadTokenFilterFactory" encoder="float"/>
    <filter class="solr.PatternReplaceFilterFactory" pattern="^[a-z]{2,5}[0-9]{1,4}?([.]|[a-z])?(.*)" replacement="" replace="all"/>
    <filter class="solr.WordDelimiterFilterFactory" preserveOriginal="1" generateNumberParts="1"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.TrimFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.LengthFilterFactory" min="1" max="30"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
  </analyzer>
  ...
</fieldType>

Thank you.
Re: Problem in including both clustering component and spellchecker for solr search results at the same time
Markus, I did it like this:

<requestHandler name="search" class="solr.SearchHandler" default="true">
  <lst name="defaults">
    <str name="spellcheck.dictionary">default</str>
    <str name="spellcheck.onlyMorePopular">true</str>
    <str name="spellcheck.extendedResults">false</str>
    <str name="spellcheck.count">1</str>
  </lst>
  <lst name="appends">
    <str name="spellcheck.dictionary">default</str>
    <str name="spellcheck.onlyMorePopular">true</str>
    <str name="spellcheck.extendedResults">false</str>
    <str name="spellcheck.count">1</str>
  </lst>
</requestHandler>

I hope I have done things correctly. But when I run the Solr server I get the exception:

org.apache.solr.common.SolrException: Unknown Search Component: clusteringComponent

- Thanks Regards Romi
Re: how to improve query result time.
How long does an average query take? I have noticed that queries with contents like you specified can take a while to return hits. How big is your index?

On Mon, Jul 4, 2011 at 8:48 AM, Jason, Kim hialo...@gmail.com wrote: Hi All. I have complex phrase queries including wildcards (e.g. q=conn* pho*~2 OR inter* pho*~2 OR ...). They take a long time to return results. I tried reindexing after changing termIndexInterval to 8, to reduce query time by loading more term index info. I thought that would make queries faster, but it didn't. I suspect searching the .frq/.prx files is what takes the time... Any ideas for improving query time? I'm using Solr 1.4 and the relevant part of schema.xml is below.

<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterWithUnstemFilterFactory" language="English" protected="protwords.txt"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
  </analyzer>
</fieldType>

Thanks in advance.

-- Regards, Dmitry Kan
@field for child object
hi, I'm wondering whether SolrJ's @Field annotation supports embedded child objects? e.g.

class A {
  @Field
  String someField;

  @Embedded   // is something like this possible for a child object?
  B b;
}

regards, kiwi
Spellchecker in zero-hit search result
Hi! I want my spellchecker component to return search query suggestions regardless of the number of items in the search results. (Actually I'd find it most useful in zero-hit cases...) Currently I only get suggestions if the search returns one or more hits.

Example: q=place

<response>
  <result name="response" numFound="20" start="0" maxScore="2.2373123"/>
  <lst name="spellcheck">
    <lst name="suggestions">
      <lst name="place">
        <int name="numFound">4</int>
        <int name="startOffset">0</int>
        <int name="endOffset">5</int>
        <arr name="suggestion">
          <str>place</str>
          <str>places</str>
          <str>placed</str>
        </arr>
      </lst>
      <str name="collation">place</str>
    </lst>
  </lst>
</response>

Example: q=placw

<response>
  <result name="response" numFound="0" start="0" maxScore="0.0"/>
  <lst name="spellcheck">
    <lst name="suggestions"/>
  </lst>
</response>

This is my spellchecker configuration (where I have already fiddled around more than is probably useful):

<searchComponent class="solr.SpellCheckComponent" name="suggest">
  <lst name="spellchecker">
    <str name="name">suggest</str>
    <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
    <str name="lookupImpl">org.apache.solr.spelling.suggest.tst.TSTLookup</str>
    <str name="field">autocomplete</str>
    <float name="threshold">0.005</float>
    <str name="accuracy">0.1</str>
    <str name="buildOnCommit">true</str>
    <float name="thresholdTokenFrequency">.001</float>
  </lst>
</searchComponent>

<requestHandler class="org.apache.solr.handler.component.SearchHandler" name="/suggest">
  <lst name="defaults">
    <str name="wt">json</str>
    <str name="spellcheck">true</str>
    <str name="spellcheck.dictionary">suggest</str>
    <str name="spellcheck.onlyMorePopular">false</str>
    <str name="spellcheck.count">4</str>
    <str name="spellcheck.collate">true</str>
  </lst>
  <arr name="components">
    <str>suggest</str>
  </arr>
</requestHandler>

Did I misunderstand anything? Thanks!
configure dismax requestHandler to boost a field
I want to apply boosts during searching: if a query term occurs in both description and name, docs having the query term in the description field should come higher in the search results. For this I configured the dismax request handler as:

<requestHandler name="dismax" class="solr.DisMaxRequestHandler" default="true">
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <float name="tie">0.01</float>
    <str name="qf">text^0.5 name^1.0 description^1.5</str>
    <str name="fl">UID_PK,name,price,description</str>
    <str name="mm">2&lt;-1 5&lt;-2 6&lt;90%</str>
    <int name="ps">100</int>
    <str name="q.alt">*:*</str>
    <str name="f.name.hl.fragsize">0</str>
    <str name="f.name.hl.alternateField">name</str>
    <str name="f.text.hl.fragmenter">regex</str>
  </lst>
</requestHandler>

But I am not seeing any effect on my search results. Do I need to do some more configuration to see the effect? - Thanks Regards Romi
Re: How many fields can SOLR handle?
Nobody? I'm still confused about this.
Problems using Solr with UIMA
Hi All. I tried integrating UIMA into Solr, following the instructions here: https://svn.apache.org/repos/asf/lucene/dev/trunk/solr/contrib/uima/README.txt However, I get a solrconfig error when I try to run Solr as a webapp in Eclipse: org.apache.solr.common.SolrException: Error loading class 'org.apache.solr.uima.processor.UIMAUpdateRequestProcessorFactory' But the class does exist in the JAR snapshot built from solr/contrib/uima. Any suggestions? I did search the past archives, but did not find anything addressing this particular error... S. -- Sowmya V.B. Losing optimism is blasphemy! http://vbsowmya.wordpress.com
what is the optimum size of Solr indexes
Hi, what is the maximum size a single Solr index can reach while still giving optimum search time? And if I have to index all the documents in my repository (which is TBs in size), what would be the ideal architecture to follow - distributed Solr? Regards, JAME VAALET Software Developer EXT :8108 Capital IQ
Re: what is the optimum size of Solr indexes
There are solutions for indexing huge data volumes, e.g. SolrCloud, ZooKeeper integration, multi-core, multi-shard. Depending on your requirements you can choose one or the other. On 4 July 2011 17:21, Jame Vaalet jvaa...@capitaliq.com wrote: Hi, what is the maximum size a single Solr index can reach while still giving optimum search time? And if I have to index all the documents in my repository (which is TBs in size), what would be the ideal architecture to follow - distributed Solr? Regards, JAME VAALET Software Developer EXT :8108 Capital IQ -- Thanks and Regards Mohammad Shariq
Re: Problems using Solr with UIMA
Hello Sowmya, is the problem a ClassNotFoundException? If so, check that there is a <lib> element referencing the solr-uima jar. Otherwise it may be some configuration error. By the way, which version of Solr are you using? I ask since you're looking at the README for trunk, but you may be using Solr jars from a different version. Cheers, Tommaso 2011/7/4 Sowmya V.B. vbsow...@gmail.com Hi All. I tried integrating UIMA into Solr, following the instructions here: https://svn.apache.org/repos/asf/lucene/dev/trunk/solr/contrib/uima/README.txt However, I get a solrconfig error when I try to run Solr as a webapp in Eclipse: org.apache.solr.common.SolrException: Error loading class 'org.apache.solr.uima.processor.UIMAUpdateRequestProcessorFactory' But the class does exist in the JAR snapshot built from solr/contrib/uima. Any suggestions? ...
Re: How many fields can SOLR handle?
Hi! I can't help you with the question about the limit on the number of fields, but so far I haven't read anywhere that there is one, so I'd assume there is none. For your second question: Another question: Is it possible to add the FACET fields automatically to my query? facet.field=*_FACET? Now I first do a request to a DB to get the FACET titles and add them to the request: facet.field=cpu_FACET,gpu_FACET. I'm afraid that *_FACET is an overkill solution. You can add parameters automatically (as defaults) to your requests. Look in the solrconfig.xml file for the requestHandler that handles your requests (in the example it's the one starting <requestHandler name="search" class="solr.SearchHandler" default="true">). There is a <lst name="defaults"> where you can add as many request parameters as you like. Is that what you're talking about? On Mon, Jul 4, 2011 at 13:44, roySolr royrutten1...@gmail.com wrote: Nobody? I'm still confused about this.
Re: what is the optimum size of Solr indexes
On Mon, 2011-07-04 at 13:51 +0200, Jame Vaalet wrote: What would be the maximum size of a single SOLR index file for resulting in optimum search time? There is no clear answer. It depends on the number of (unique) terms, the number of documents, bytes on storage, storage speed, query complexity, faceting, the number of concurrent users and a lot of other factors. In case I have got to index all the documents in my repository (which is in TB size) what would be the ideal architecture to follow, distributed SOLR? A TB of source documents might very well end up as a simple, single-machine index of 100GB or less. It depends on the amount of search-relevant information in the documents rather than their size in bytes. If your sources are Word documents or a similar format with a relatively large amount of stuffing, and your searches are mostly simple (the user enters 2-5 terms and hits enter), my guess is that you don't need to worry about distribution yet. Make a pilot. Most of the work you'll have to do for a single-machine test can be reused for a distributed production setup.
Question regarding solr workflow
Hi, what is the workflow of Solr, starting from submitting an XML document to be indexed? Is there any default analyzer that is called before the analyzer specified in my Solr schema for the text field? I have a situation where the words of the text field being analyzed somehow get split. For example, if I index the text field "ABC DEF", I can get it back as "AB C D EF". Thanks engy
Re: configure dismax requestHandler to boost a field
On Mon, Jul 4, 2011 at 13:11, Romi romijain3...@gmail.com wrote: I want to apply boosts during searching: if a query term occurs in both description and name, docs having the query term in the description field should come higher in the search results. For this I configured the dismax request handler as: ... But I am not seeing any effect on my search results. Do I need to do some more configuration to see the effect?

Did you return the score for the queries? Did you compare scores between trials with description^1.5 and, for example, description^10.0? Did you restart Solr after changes to solrconfig.xml? Marian
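One way to check Marian's first two points is to ask Solr for the score pseudo-field and the debug explanations from SolrJ. A minimal sketch, assuming a SolrJ 3.x client, a local Solr on port 8983, and a made-up query term ("diamond"); CommonsHttpSolrServer was the stock HTTP client of that era:

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.SolrServer;
    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
    import org.apache.solr.client.solrj.response.QueryResponse;
    import org.apache.solr.common.SolrDocument;

    public class DismaxScoreCheck {
        public static void main(String[] args) throws Exception {
            SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");
            SolrQuery q = new SolrQuery("diamond");            // hypothetical query term
            q.set("defType", "dismax");                        // use the dismax parser
            q.set("qf", "text^0.5 name^1.0 description^1.5");  // boosts under test
            q.set("fl", "UID_PK,name,description,score");      // request the score explicitly
            q.set("debugQuery", "true");                       // per-document score breakdown
            QueryResponse rsp = server.query(q);
            for (SolrDocument doc : rsp.getResults()) {
                System.out.println(doc.getFieldValue("UID_PK") + " score=" + doc.getFieldValue("score"));
            }
        }
    }

Comparing the printed scores (and the explain section of the debug output) between a description^1.5 run and a description^10.0 run shows quickly whether the qf boosts are being picked up at all.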
Re: Problems using Solr with UIMA
Hi Tommaso, I am using Solr 3.3, which was released last week. The README in the Solr version I have has the same info as the README at that link. There is a lib element in my solrconfig.xml:

<lib dir="../../dist/" regex="apache-solr-uima-\d.*\.jar" />

Here is my trace; from this, it looks like a ClassNotFoundException:

The server encountered an internal error (Severe errors in solr configuration. Check your log files for more detailed information on what may be wrong. If you want solr to continue after configuration errors, change: <abortOnConfigurationError>false</abortOnConfigurationError> in solr.xml

org.apache.solr.common.SolrException: Error loading class 'org.apache.solr.uima.processor.UIMAUpdateRequestProcessorFactory'
    at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:389)
    at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:423)
    at org.apache.solr.core.SolrCore.createInitInstance(SolrCore.java:445)
    at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:1569)
    at org.apache.solr.update.processor.UpdateRequestProcessorChain.init(UpdateRequestProcessorChain.java:57)
    at org.apache.solr.core.SolrCore.createInitInstance(SolrCore.java:447)
    at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:1553)
    at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:1547)
    at org.apache.solr.core.SolrCore.loadUpdateProcessorChains(SolrCore.java:620)
    at org.apache.solr.core.SolrCore.init(SolrCore.java:561)
    at org.apache.solr.core.CoreContainer.create(CoreContainer.java:463)
    at org.apache.solr.core.CoreContainer.load(CoreContainer.java:316)
    at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:133)
    at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:94)
    at org.apache.catalina.core.ApplicationFilterConfig.initFilter(ApplicationFilterConfig.java:273)
    at org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:254)
    at org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:372)
    at org.apache.catalina.core.ApplicationFilterConfig.init(ApplicationFilterConfig.java:98)
    at org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:4562)
    at org.apache.catalina.core.StandardContext$2.call(StandardContext.java:5240)
    at org.apache.catalina.core.StandardContext$2.call(StandardContext.java:5235)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:680)
Caused by: java.lang.ClassNotFoundException: org.apache.solr.uima.processor.UIMAUpdateRequestProcessorFactory
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
    at java.net.FactoryURLClassLoader.loadClass(URLClassLoader.java:627)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:247)
    at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:373)
    ... 25 more
) that prevented it from fulfilling this request.

Thanks. Sowmya.

On Mon, Jul 4, 2011 at 2:15 PM, Tommaso Teofili tommaso.teof...@gmail.com wrote: Hello Sowmya, is the problem a ClassNotFoundException? If so, check that there is a <lib> element referencing the solr-uima jar. ...

-- Sowmya V.B. Losing optimism is blasphemy! http://vbsowmya.wordpress.com
Re: Problems using Solr with UIMA
Hello Sowmya, I've just made a fresh checkout from http://svn.apache.org/repos/asf/lucene/dev/tags/lucene_solr_3_3/ and then done the following:

1. cd solr
2. ant example
3. cd solr/contrib/uima
4. ant dist
5. cd ../../example
6. edit solr/conf/solrconfig.xml
7. copy-paste the lib directives:
   <lib dir="../../contrib/uima/lib" />
   <lib dir="../../dist/" regex="apache-solr-uima-\d.*\.jar" />
8. copy-paste the <updateRequestProcessorChain name="uima"> element from point 3 of the README [1] into solrconfig.xml
9. create the request handler as in point 4 of the README
10. run java -jar start.jar from the command line

It worked for me. Since you said you were running the webapp from inside Eclipse, I wonder if it's a classpath problem related to Eclipse. Hope this helps, Tommaso

[1] : https://svn.apache.org/repos/asf/lucene/dev/tags/lucene_solr_3_3/solr/contrib/uima/README.txt

2011/7/4 Sowmya V.B. vbsow...@gmail.com Hi Tommaso, I am using Solr 3.3, which was released last week. ...
Solr vs Hibernate Search (Huge number of DB DMLs)
Hi all, there are several places where I could find a discussion on this, but I failed to find one suited to my case, so I'd like to be clear on my requirements so that you may suggest the better solution. A project deals with tons of database tables (with millions of records), out of which some are to be indexed and must of course be searchable. It uses Hibernate for MySQL transactions. As far as I know, there are two candidate solutions for keeping the index and the database in sync effectively. There will be a huge number of transactions (DMLs) on the DB, so I'm wondering which of the following can handle that effectively.

1) Configure a Solr server; query it for searches and send it events for updates. This might be better than handling Lucene directly, since Solr provides index read/write and load balancing. The problem here could be keeping the index and DB in sync with no lag, as the updates (DMLs on the DB) are very frequent - too many events to be sent!

2) Use Hibernate Search. I'm just wondering about its performance considering the high volume of transactions on the DB every minute.

Please suggest. Thanks in advance.
Re: Problems using Solr with UIMA
Hello Tommaso, it was indeed a relative-path issue inside Eclipse. I keyed in the full path instead of ../../ and it ran without throwing an error. However, when I gave an old Lucene index directory's path as the index path and modified schema.xml accordingly, it still says numDocs = 0 on the stats.jsp page. How can I tell Solr to use an already existing Lucene index (which also used UIMA)? This is just to check that the integration works and to make sure I am on the right track. S.

On Mon, Jul 4, 2011 at 2:55 PM, Tommaso Teofili tommaso.teof...@gmail.com wrote: Hello Sowmya, I've just made a fresh checkout from http://svn.apache.org/repos/asf/lucene/dev/tags/lucene_solr_3_3/ and then done the following: 1. cd solr 2. ant example ... 2011/7/4 Sowmya V.B. vbsow...@gmail.com Hi Tommaso, I am using Solr 3.3, which was released last week. ...
Re: Custom Cache cleared after a commit?
On Mon, Jul 4, 2011 at 2:07 AM, arian487 akarb...@tagged.com wrote: I guess I'll have to use something other than SolrCache to get what I want, then. Or I could use SolrCache and just change the code (I've already done so much of that anyway...). Anyway, thanks for the reply.

You can specify a regenerator for your cache that examines items in the old cache and pre-populates the new cache when a commit happens. -Yonik http://www.lucidimagination.com
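For reference, the hook Yonik describes is Solr's CacheRegenerator interface: you name an implementation in the regenerator attribute of your cache's entry in solrconfig.xml, and Solr calls it once per old entry when a new searcher is opened. A minimal sketch, assuming the cached values can simply be carried over (a real implementation might recompute them against the new searcher):

    import java.io.IOException;
    import org.apache.solr.search.CacheRegenerator;
    import org.apache.solr.search.SolrCache;
    import org.apache.solr.search.SolrIndexSearcher;

    public class CarryOverRegenerator implements CacheRegenerator {
        // Invoked for each entry of the old cache during warm-up after a commit.
        public boolean regenerateItem(SolrIndexSearcher newSearcher,
                                      SolrCache newCache, SolrCache oldCache,
                                      Object oldKey, Object oldVal) throws IOException {
            // Placeholder policy: keep the old value; recompute it here instead
            // if the value depends on the index contents.
            newCache.put(oldKey, oldVal);
            return true; // returning false aborts further regeneration
        }
    }

The cache's autowarmCount setting controls how many of the old entries are offered to the regenerator.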
Re: Problems using Solr with UIMA
Hello Tommaso, I noticed that though I can see the Solr admin interface, clicking the schema and conf links does not take me to the pages under the webapp's solr/conf/ folder - again, I guess, because of Eclipse paths. This is the trace on the console:

INFO: Solr home set to 'solr/./'
Jul 4, 2011 4:57:58 PM org.apache.solr.common.SolrException log
SEVERE: java.lang.RuntimeException: Can't find resource 'solrconfig.xml' in classpath or 'solr/./conf/', cwd=/Users/svajjala/Documents/eclipse/Eclipse.app/Contents/MacOS
    at org.apache.solr.core.SolrResourceLoader.openResource(SolrResourceLoader.java:268)
    at org.apache.solr.core.SolrResourceLoader.openConfig(SolrResourceLoader.java:234)
    at org.apache.solr.core.Config.init(Config.java:141)
    at org.apache.solr.core.SolrConfig.init(SolrConfig.java:131)
    at org.apache.solr.core.CoreContainer.create(CoreContainer.java:435)
    at org.apache.solr.core.CoreContainer.load(CoreContainer.java:316)
    at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:133)
    at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:94)
    at org.apache.catalina.core.ApplicationFilterConfig.initFilter(ApplicationFilterConfig.java:273)
    at org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:254)
    at org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:372)
    at org.apache.catalina.core.ApplicationFilterConfig.init(ApplicationFilterConfig.java:98)
    at org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:4562)
    at org.apache.catalina.core.StandardContext$2.call(StandardContext.java:5240)
    at org.apache.catalina.core.StandardContext$2.call(StandardContext.java:5235)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:680)
Jul 4, 2011 4:57:58 PM org.apache.solr.servlet.SolrDispatchFilter init
INFO: user.dir=/Users/svajjala/Documents/eclipse/Eclipse.app/Contents/MacOS
Jul 4, 2011 4:57:58 PM org.apache.solr.servlet.SolrDispatchFilter init
INFO: SolrDispatchFilter.init() done
Jul 4, 2011 4:57:58 PM org.apache.solr.servlet.SolrServlet init
INFO: SolrServlet.init()
Jul 4, 2011 4:57:58 PM org.apache.solr.core.SolrResourceLoader locateSolrHome
INFO: No /solr/home in JNDI
Jul 4, 2011 4:57:58 PM org.apache.solr.core.SolrResourceLoader locateSolrHome
INFO: solr home defaulted to 'solr/' (could not find system property or JNDI)
Jul 4, 2011 4:57:58 PM org.apache.solr.servlet.SolrServlet init
INFO: SolrServlet.init() done
Jul 4, 2011 4:57:58 PM org.apache.solr.core.SolrResourceLoader locateSolrHome
INFO: No /solr/home in JNDI
Jul 4, 2011 4:57:58 PM org.apache.solr.core.SolrResourceLoader locateSolrHome
INFO: solr home defaulted to 'solr/' (could not find system property or JNDI)
Jul 4, 2011 4:57:58 PM org.apache.solr.servlet.SolrUpdateServlet init
INFO: SolrUpdateServlet.init() done
Jul 4, 2011 4:57:58 PM org.apache.coyote.AbstractProtocolHandler start
INFO: Starting ProtocolHandler [http-bio-8080]
Jul 4, 2011 4:57:58 PM org.apache.coyote.AbstractProtocolHandler start
INFO: Starting ProtocolHandler [ajp-bio-8009]
Jul 4, 2011 4:57:58 PM org.apache.catalina.startup.Catalina start
INFO: Server startup in 3661 ms
Jul 4, 2011 4:58:02 PM org.apache.solr.core.SolrCore execute
INFO: [] webapp=/apache-solr-3.3.0 path=/admin/file/ params={file=schema.xml&contentType=text/xml;charset%3Dutf-8} status=0 QTime=1

I used Solr before from the command line and never had such errors. I am new to IDE usage, not to Solr, so I don't understand the path errors :( S

On Mon, Jul 4, 2011 at 3:41 PM, Sowmya V.B. vbsow...@gmail.com wrote: Hello Tommaso, it was indeed a relative-path issue inside Eclipse. ...
Re: Spellchecker in zero-hit search result
Hi Marian, I guess your problem isn't related to the number of results, but to the component's configuration. The configuration you show sets up an autocomplete component that suggests terms from an incomplete user input (similar to what Google does while you're typing in the search box); see http://wiki.apache.org/solr/Suggester. That's why your suggestions for place are places and placed, all sharing the place prefix. But when you search for placw, the component doesn't return any suggestion, because no term in your index begins with placw. You can learn how to correctly configure a spellchecker here: http://wiki.apache.org/solr/SpellCheckComponent. Also, I'd recommend taking a look at the example's solrconfig, because it provides an example spellchecker configuration. Regards, Juan

On Mon, Jul 4, 2011 at 7:30 AM, Marian Steinbach marian.steinb...@gmail.com wrote: Hi! I want my spellchecker component to return search query suggestions regardless of the number of items in the search results. ... Did I misunderstand anything? Thanks!
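Once a real spellcheck dictionary is in place, the zero-hit case can be verified from SolrJ, which exposes the spellcheck section through a typed response object. A sketch, assuming the /suggest handler name from the config above, a SolrJ 3.x client, and that the handler is reachable via the qt parameter:

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.SolrServer;
    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
    import org.apache.solr.client.solrj.response.QueryResponse;
    import org.apache.solr.client.solrj.response.SpellCheckResponse;

    public class SpellcheckProbe {
        public static void main(String[] args) throws Exception {
            SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");
            SolrQuery q = new SolrQuery("placw");   // deliberately misspelled term
            q.set("qt", "/suggest");                // handler name assumed from the config
            q.set("spellcheck", "true");
            QueryResponse rsp = server.query(q);
            SpellCheckResponse sc = rsp.getSpellCheckResponse();
            if (sc != null) {
                for (SpellCheckResponse.Suggestion s : sc.getSuggestions()) {
                    System.out.println(s.getToken() + " -> " + s.getAlternatives());
                }
            }
        }
    }

With a dictionary-based spellchecker (rather than the prefix-based Suggester) this should print alternatives even when numFound is 0.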
Is solrj 3.3.0 ready for field collapsing?
Hi, I've tried to add the params group=true and group.field=myfield using SolrQuery, but the result is null. Do I have to configure something? I couldn't find anything in the wiki section on field collapsing. Thanks Per
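If I recall correctly, SolrJ 3.3 can send the grouping parameters but has no typed accessor for the grouped section yet, which would explain the null result; the data is still present in the raw NamedList. A sketch of reading it that way (myfield is the poster's placeholder field name):

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.SolrServer;
    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
    import org.apache.solr.client.solrj.response.QueryResponse;
    import org.apache.solr.common.util.NamedList;

    public class GroupingProbe {
        public static void main(String[] args) throws Exception {
            SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");
            SolrQuery q = new SolrQuery("*:*");
            q.set("group", "true");
            q.set("group.field", "myfield");   // assumed field name
            QueryResponse rsp = server.query(q);
            // No typed grouping API in SolrJ 3.3: walk the raw response instead.
            NamedList<?> grouped = (NamedList<?>) rsp.getResponse().get("grouped");
            NamedList<?> byField = (NamedList<?>) grouped.get("myfield");
            System.out.println("matches: " + byField.get("matches"));
        }
    }

The grouped section then contains a groups list per field, each group carrying its own doclist of results.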
A beginner problem
I use Nutch as a search engine. Until now Nutch did both the crawl and the search functions; the newest version, however, delegates search to Solr. I know almost nothing about programming, but I'm able to follow a recipe, so I went to the Solr site, downloaded Solr and tried to follow the tutorial. In the example folder of Solr, using java -jar start.jar, I got:

2011-07-04 13:22:38.439:INFO::Logging to STDERR via org.mortbay.log.StdErrLog
2011-07-04 13:22:38.893:INFO::jetty-6.1-SNAPSHOT
2011-07-04 13:22:38.946:INFO::Started SocketConnector@0.0.0.0:8983

When I tried to go to http://localhost:8983/solr/admin/ I got:

HTTP ERROR: 404 Problem accessing /solr/admin/. Reason: NOT_FOUND

Can someone help me with this? Thanks
Re: Solr vs Hibernate Search (Huge number of DB DMLs)
From my exploration so far, I understood that Solr is the obvious choice when index changes are kept to a minimum. However, my case is exactly the opposite. I'm still unclear about the right solution for the scenario I described. Please share your thoughts.

On Mon, Jul 4, 2011 at 6:28 PM, fire fox fyr3...@gmail.com wrote: Hi all, there are several places where I could find a discussion on this, but I failed to find one suited to my case. ...
Re: Question regarding solr workflow
On Mon, Jul 4, 2011 at 5:47 PM, Engy Morsy engy.mo...@bibalex.org wrote: What is the workflow of Solr, starting from submitting an XML document to be indexed? Is there any default analyzer that is called before the analyzer specified in my Solr schema for the text field? I have a situation where the words of the text field being analyzed somehow get split.

Only the analyzer specified in the Solr schema is applied. You can try the Analysis link on the Solr dashboard to see how the analysis is done for a particular field. -- Regards, Shalin Shekhar Mangar.
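The same inspection the Analysis page does can also be run programmatically: SolrJ ships a FieldAnalysisRequest (backed by the /analysis/field handler in the example solrconfig, available since Solr 1.4) that returns the tokens after each analysis stage. A sketch, with the field name and sample text as assumptions:

    import org.apache.solr.client.solrj.SolrServer;
    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
    import org.apache.solr.client.solrj.request.FieldAnalysisRequest;
    import org.apache.solr.client.solrj.response.FieldAnalysisResponse;

    public class AnalysisProbe {
        public static void main(String[] args) throws Exception {
            SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");
            FieldAnalysisRequest req = new FieldAnalysisRequest();
            req.addFieldName("text");          // assumed field name
            req.setFieldValue("ABC DEF");      // the text that comes back split
            FieldAnalysisResponse rsp = req.process(server);
            // Each stage of the index-time chain and its emitted tokens:
            System.out.println(rsp.getFieldNameAnalysis("text"));
        }
    }

Seeing exactly which tokenizer or filter produces "AB C D EF" usually pinpoints the misbehaving stage in the chain.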
Re: Index Version and Epoch Time?
: The index version shown on the dashboard is the time at which the most : recent index segment was created. I'm not sure why it has a value older than : a month if a commit has happened after that time.

I'm fairly certain that's false. Last time I checked, newly created indexes are assigned a version based on index time, but after that each commit simply increments the version -- so index versions are only suitable for comparing whether one instance of an index is newer or older than another instance of the same index; they don't tell you anything about relative age. -Hoss
Re: Index Version and Epoch Time?
On Tue, Jul 5, 2011 at 12:03 AM, Chris Hostetter hossman_luc...@fucit.org wrote: I'm fairly certain that's false. Last time I checked, newly created indexes are assigned a version based on index time, but after that each commit simply increments the version -- so index versions are only suitable for comparing whether one instance of an index is newer or older than another instance of the same index; they don't tell you anything about relative age.

Thanks for clearing that up, Hoss. I only looked at one place where IndexCommit was being created, and it used System.currentTimeMillis, hence the confusion. Anyway, what the version represents is not guaranteed beyond uniquely identifying a commit point, so users should not make any assumptions. -- Regards, Shalin Shekhar Mangar.
Re: upgraded from 2.9 to 3.x, problems. help?
: i recently upgraded all systems for indexing and searching to lucene/solr 3.1,
: and unfortunately it seems there's a lot more changed under the hood than
: there used to be.

It sounds like you are saying you had a system that was working fine for you, but when you tried to upgrade it stopped working.

: i have a java based indexer and a solr based searcher, on the java end for
...
: Analyzer an = new StandardAnalyzer(Version.LUCENE_31, nostopwords);

Right off the bat, that line of code couldn't possibly have been in your existing 2.9 code (Version.LUCENE_31 didn't exist in 2.9), and it instructs StandardAnalyzer to do some very basic things very differently than they were done in 2.9...

http://lucene.apache.org/java/3_1_0/api/all/org/apache/lucene/analysis/standard/StandardAnalyzer.html

I would start by setting that to Version.LUCENE_29 to tell StandardAnalyzer that you want the same behavior as before. Having said all of that -- the LUCENE_31 behavior is considered better than the LUCENE_29 behavior, so you should consider changing it to get the benefits -- but you need to understand your full analysis stack to do that.

: and for the solr end i have:

...you should also check whether you added a <luceneMatchVersion/> of LUCENE_31 to your solrconfig.xml -- if not, do so, so that it's consistent with your external java code.

Generally speaking, just having your indexer use an off-the-shelf analyzer while your solr instance instead uses something like WordDelimiterFilter isn't going to work well; you need to think about index-time analysis and query-time analysis in conjunction with each other. Hang on, scratch that -- you may think you are using WordDelimiterFilterFactory, but you are not...

: <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
:   <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="0"/>
:   <analyzer class="org.apache.lucene.analysis.standard.StandardAnalyzer" ignoreCase="true" />
: </fieldType>

...you can't just plop a <filter/> tag into a <fieldType/> like that and have it mean something. <filter/> can be used when you are declaring a custom analyzer chain in the schema.xml; if you use <analyzer class="..."/> you get a concrete analyzer with hardcoded behavior. So if you aren't getting matches, it's a straight-up discrepancy between the LUCENE_31 setting and whatever setting you have in solrconfig.xml (which, if you didn't add it to your existing config, is going to be a legacy default ... 2.4 or 2.9 ... i can't remember). -Hoss
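A sketch of the indexer-side change Hoss suggests as a first step; nostopwords is the stop-word set from the original post, and LUCENE_29 is only a temporary compatibility setting, not the end state:

    import java.util.Set;
    import org.apache.lucene.analysis.Analyzer;
    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.util.Version;

    public class AnalyzerSetup {
        public static Analyzer buildIndexAnalyzer(Set<?> nostopwords) {
            // Version.LUCENE_29 asks the 3.1 jar to emulate the 2.9 behavior;
            // switch to LUCENE_31 only after the index-time and query-time
            // analysis chains have been reviewed together (and reindex then).
            return new StandardAnalyzer(Version.LUCENE_29, nostopwords);
        }
    }

The same Version constant should then be mirrored by luceneMatchVersion in solrconfig.xml so both sides analyze text identically.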
Re: A beginner problem
It's hard to tell what is happening without more details about your setup. I would start by asking:

- Do you have a firewall installed?
- What operating system do you run Solr on?
- Can you ping the hostname localhost?

Filype

On Tue, Jul 5, 2011 at 4:49 AM, carmme...@qualidade.info wrote: I use Nutch as a search engine. Until now Nutch did both the crawl and the search functions; the newest version, however, delegates search to Solr. ...
Re: How do I compute and store a field?
You can create a custom update processor. The passed AddUpdateCommand object has an accessor for the SolrInputDocument you're about to add; in the processAdd method you can add a new field with whatever you want. The wiki has a good example: http://wiki.apache.org/solr/UpdateRequestProcessor

Hello, I'm trying to add a field that counts the number of terms in a document to my schema. So far I've been computing this value at query time. Is there a way I could compute this once only and store it in a field?

final SolrIndexSearcher searcher = request.getSearcher();
final SolrIndexReader reader = searcher.getReader();
final String content = "content";
final byte[] norms = reader.norms(content);
final int[] docLengths;
if (norms == null) {
    docLengths = null;
} else {
    docLengths = new int[norms.length];
    int i = 0;
    for (byte b : norms) {
        float docNorm = searcher.getSimilarity().decodeNormValue(b);
        int docLength = 0;
        if (docNorm != 0) {
            docLength = (int) (1 / docNorm); // reciprocal
        }
        docLengths[i++] = docLength;
    }
}
...
final NumericField docLenNormField = new NumericField(TestQueryResponseWriter.DOC_LENGHT);
docLenNormField.setIntValue(docLengths[id]);
doc.add(docLenNormField);
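A rough sketch of the processor Markus describes, counting whitespace-separated terms into a new field at index time. The field names are placeholders, and the naive split is a stand-in; a real implementation would run the field's schema analyzer instead:

    import java.io.IOException;
    import org.apache.solr.common.SolrInputDocument;
    import org.apache.solr.request.SolrQueryRequest;
    import org.apache.solr.response.SolrQueryResponse;
    import org.apache.solr.update.AddUpdateCommand;
    import org.apache.solr.update.processor.UpdateRequestProcessor;
    import org.apache.solr.update.processor.UpdateRequestProcessorFactory;

    public class DocLengthUpdateProcessorFactory extends UpdateRequestProcessorFactory {
        @Override
        public UpdateRequestProcessor getInstance(SolrQueryRequest req,
                SolrQueryResponse rsp, UpdateRequestProcessor next) {
            return new UpdateRequestProcessor(next) {
                @Override
                public void processAdd(AddUpdateCommand cmd) throws IOException {
                    SolrInputDocument doc = cmd.getSolrInputDocument();
                    Object content = doc.getFieldValue("content");   // placeholder field
                    if (content != null) {
                        // Crude term count; swap in the schema analyzer for real use.
                        doc.addField("doc_length", content.toString().split("\\s+").length);
                    }
                    super.processAdd(cmd);  // pass the doc down the chain
                }
            };
        }
    }

The factory is then registered in an updateRequestProcessorChain in solrconfig.xml, ahead of the standard run/log processors, so the new field is present before the document is written.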
Re: After the query component has the results, can I do more filtering on them?
: Sorry for the double post but in this case, is it possible for me to access
: the queryResultCache in my component and play with it? Ideally what I want
: is this:

It could be possible to do what you're describing, but it would probably be fairly brittle. I know you said earlier that you can't use any existing components, but I strongly urge you to post the details on *what* you want to do (ie: where are these scores coming from, how are they determined, how often do they change, do all of them change or just some of them, etc..) instead of *how* you want to do it (ie: modify the scores after the search).

Even if an existing tool (like ExternalFileField) can't be used directly in your case, providing the full information about your use case may help people suggest a completely different approach than the one you're considering...

http://people.apache.org/~hossman/#xyproblem XY Problem

Your question appears to be an XY Problem ... that is: you are dealing with X, you are assuming Y will help you, and you are asking about Y without giving more details about the X so that we can understand the full issue. Perhaps the best solution doesn't involve Y at all? See Also: http://www.perlmonks.org/index.pl?node_id=542341 -Hoss
Re: How do I compute and store a field?
Gee, I was about to post. I figured my issue is that of computing the unique terms per document. One approach is to run the analyzer on the document before calling addDocument and count the number of tokens; then I can invoke addDocument with the value of the field already computed. The only issue is that I'm assuming that if I use the same Analyzer that addDocument uses, the count will always equal the number of terms indexed for that document. Is that a correct assumption? Is there an alternative where I don't need to make this assumption?

On Tue, Jul 5, 2011 at 1:29 AM, Markus Jelsma markus.jel...@openindex.io wrote: You can create a custom update processor. ...

-- Regards, K. Gabriele --- unchanged since 20/9/10 --- P.S. If the subject contains [LON] or the addressee acknowledges the receipt within 48 hours then I don't resend the email. subject(this) ∈ L(LON*) ∨ ∃x. (x ∈ MyInbox ∧ Acknowledges(x, this) ∧ time(x) < Now + 48h) ⇒ ¬resend(I, this). If an email is sent by a sender that is not a trusted contact or the email does not contain a valid code then the email is not received. A valid code starts with a hyphen and ends with X. ∀x. x ∈ MyInbox ⇒ from(x) ∈ MySafeSenderList ∨ (∃y. y ∈ subject(x) ∧ y ∈ L(-[a-z]+[0-9]X)).
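A sketch of that pre-count, driving the same Analyzer instance by hand through the Lucene 3.x TokenStream API; the field name and analyzer are whatever the indexer already uses. Note this counts all tokens as the post describes, so counting unique terms would additionally need a Set of the term texts:

    import java.io.IOException;
    import java.io.StringReader;
    import org.apache.lucene.analysis.Analyzer;
    import org.apache.lucene.analysis.TokenStream;

    public class TermCounter {
        /** Counts the tokens the analyzer would emit for this field/text pair. */
        public static int countTokens(Analyzer analyzer, String field, String text)
                throws IOException {
            TokenStream ts = analyzer.tokenStream(field, new StringReader(text));
            int count = 0;
            ts.reset();
            while (ts.incrementToken()) {
                count++;
            }
            ts.end();
            ts.close();
            return count;
        }
    }

As long as the exact same Analyzer instance and field name are used for the count and for addDocument, the emitted tokens are the same, which is precisely the assumption the post asks about.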
Re: Feed index with analyzer output
: I will be more clear on the steps that I would like to take:
: 1) Call the analyzer of Solr that returns me an XML response in the
: following format (just a snippet as example)
...
: 2) now I would like to be able to extract the info that I need from there
: and tell Solr directly which things to index, telling it directly also
: which are the tokens with their respective payload, without performing more
: analysis.

Can you explain a bit more about what your goal is here? What info are you planning on extracting? What do you intend to change between the info you get back in the first request and the info you want to send in the second request? Smells a little like an XY Problem... http://people.apache.org/~hossman/#xyproblem

...if you *really* wanted to do this you could, but you'd need different field names for the preanalysis fields you'd use in request #1 and the actual content that would be indexed/stored in request #2. Your analyzers and whatnot for request #1 would be exactly what you're used to, but for request #2 you'd need to specify an analyzer that would let you specify, in the field value, the details about the term and position, and offsets, and payloads and whatnot ... the DelimitedPayloadTokenFilterFactory / DelimitedPayloadTokenFilter can help with some of that, but not all -- you'd either need your own custom analyzer or custom FieldType or something, depending on the specific changes you want to make.

Frankly though, I really believe you are going about this backwards -- if you want to manipulate the TokenStream after analysis but before indexing, then why not implement the custom logic that you want in a TokenFilter and use it as the last TokenFilterFactory of your analyzer? -Hoss
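If the TokenFilter route Hoss suggests fits, a filter placed late in the chain can stamp or rewrite payloads on every token it sees. A minimal Lucene 3.x sketch; the single constant payload value is a placeholder for whatever per-token logic is actually needed:

    import java.io.IOException;
    import org.apache.lucene.analysis.TokenFilter;
    import org.apache.lucene.analysis.TokenStream;
    import org.apache.lucene.analysis.payloads.PayloadHelper;
    import org.apache.lucene.analysis.tokenattributes.PayloadAttribute;
    import org.apache.lucene.index.Payload;

    public final class ConstantPayloadFilter extends TokenFilter {
        private final PayloadAttribute payloadAtt = addAttribute(PayloadAttribute.class);
        private final byte[] payload;

        public ConstantPayloadFilter(TokenStream input, float value) {
            super(input);
            this.payload = PayloadHelper.encodeFloat(value); // same float encoding as the delimited filter
        }

        @Override
        public boolean incrementToken() throws IOException {
            if (!input.incrementToken()) {
                return false;
            }
            payloadAtt.setPayload(new Payload(payload)); // stamp every emitted token
            return true;
        }
    }

Wrapped in a small TokenFilterFactory and placed after the WordDelimiterFilterFactory, such a filter would also cover the earlier question in this digest about payloads not propagating to word-delimiter-generated tokens.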
full text searching in the cloud for small enterprises
hi all, I want to provide full-text searching for some small websites. Cloud computing seems popular now, and it would save costs because it doesn't require an engineer to maintain the machines. There are many services, such as Amazon S3, Google App Engine, MS Azure etc. I am not familiar with cloud computing. Can anyone give me a direction or some advice? thanks
Re: A beginner problem
Yes, I agree with Filype Pereira. Please describe your problem in detail and check everything he mentioned. Please also check port 8080. - Regards Nilay Tiwari
Re: configure dismax requesthandlar for boost a field
I am not returning the score for the queries; I assumed the boost would be reflected in the search results themselves, i.e. a doc having the query string in the description field would come higher than a doc having the query string in the name field. And yes, I restarted Solr after making the changes in the configuration. - Thanks Regards Romi