Re: SOLR results case
Hi Dave,

The stored content (which is returned in the results) isn't modified by the analyzers, so this shouldn't be a problem. Could you describe in more detail what you are doing and the results that you're getting?

Thanks,

*Juan*

On Thu, Jan 5, 2012 at 2:17 PM, Dave <dla...@gmail.com> wrote:

I'm running all of my indexed data and queries through a LowerCaseFilterFactory because I don't want to worry about case when matching. All of my results are titles - is there an easy way to restore case, or to convert all results to Title Case, when returning them? My results are returned as JSON, if that makes any difference.

Thanks,
Dave
Re: SOLR results case
Hi Dave,

Have you tried running a query and taking a look at the results? The filters that you define in the fieldType don't affect the way the data is *stored*; they affect the way the data is *indexed*. By this I mean that the filters affect the way a query matches a document, and they will affect other features that rely on the *indexed* values (like faceting), but they won't affect the way in which results are shown, which depends on the *stored* value.

*Juan*

On Thu, Jan 5, 2012 at 3:19 PM, Dave <dla...@gmail.com> wrote:

Hi Juan,

When I'm storing the content, the field has a LowerCaseFilterFactory filter, so that when I'm searching it's not case sensitive. Is there a way to re-filter the data when it's presented as a result, to restore the case or convert to Title Case?

Thanks,
Dave

On Thu, Jan 5, 2012 at 12:41 PM, Juan Grande <juan.gra...@gmail.com> wrote:

Hi Dave,

The stored content (which is returned in the results) isn't modified by the analyzers, so this shouldn't be a problem. Could you describe in more detail what you are doing and the results that you're getting?

Thanks,

*Juan*

On Thu, Jan 5, 2012 at 2:17 PM, Dave <dla...@gmail.com> wrote:

I'm running all of my indexed data and queries through a LowerCaseFilterFactory because I don't want to worry about case when matching. All of my results are titles - is there an easy way to restore case or convert all results to Title Case when returning them? My results are returned as JSON if that makes any difference.

Thanks,
Dave
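A minimal schema sketch of the setup Juan describes (the field and type names below are illustrative, not taken from the thread): the LowerCaseFilterFactory only affects the *indexed* terms, while the value returned in search results is the verbatim *stored* one, case intact.

```xml
<!-- Hypothetical field type: lowercases terms at index and query time,
     so matching is case-insensitive -->
<fieldType name="text_lower" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- stored="true" keeps the original text, unmodified by the analyzer,
     for the response -->
<field name="title" type="text_lower" indexed="true" stored="true"/>
```

With a setup like this, a query for "moby dick" matches a document whose title is "Moby Dick", and the result still shows "Moby Dick".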
Re: Highlighting in 3.5?
Hi Darren,

Would you please tell us all the parameters that you are sending in the request? You can use the parameter echoParams=all to get the list in the output.

Thanks,

*Juan*

On Mon, Jan 2, 2012 at 8:37 PM, Darren Govoni <dar...@ontrenet.com> wrote:

Forgot to add that the time when I DO want the highlight to appear would be with a query that DOES match the default field:

{!lucene q.op=OR df=text_t} kind_s:doc AND (( field_t:[* TO *] )) cars

where the term 'cars' would be matched against the df. Then I want the highlight for it. If there are no query term matches for the df, then getting ALL the field terms highlighted (as it does now) is a rather perplexing feature.

Darren

On 01/02/2012 06:28 PM, Darren Govoni wrote:

Hi Juan,

Setting that parameter produces the same extraneous results. Here is my query:

{!lucene q.op=OR df=text_t} kind_s:doc AND (( field_t:[* TO *] ))

Clearly, the default field (text_t) is not being searched by this query, and highlighting it would be semantically incongruent with the query. Is it a bug?

Darren

On 01/02/2012 04:39 PM, Juan Grande wrote:

Hi Darren,

This is the expected behavior. Have you tried setting the hl.requireFieldMatch parameter to true? See: http://wiki.apache.org/solr/HighlightingParameters#hl.requireFieldMatch

*Juan*

On Mon, Jan 2, 2012 at 10:54 AM, Darren Govoni <dar...@ontrenet.com> wrote:

Hi,

Can someone tell me if this is correct behavior from Solr. I search on a dynamic field:

field_t:[* TO *]

I set the highlight fields to field_t,text_t, but I am not searching specifically inside the text_t field. The highlights for text_t come back with EVERY WORD. Maybe because of the [* TO *], but the query semantics indicate not searching on text_t even though highlighting is enabled. Is this correct behavior? It produces unwanted highlight results. I would expect Solr to know which fields are participating in the query and only highlight those that are involved in the result set.

Thanks,
Darren
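A sketch of the kind of request under discussion, with the highlighting parameters made explicit (the host, handler path, and layout below are assumptions; the query and field names follow Darren's examples). With hl.requireFieldMatch=true, snippets are only generated from fields that the query actually matched:

```text
http://localhost:8983/solr/select
  ?q={!lucene q.op=OR df=text_t} kind_s:doc AND (( field_t:[* TO *] )) cars
  &hl=true
  &hl.fl=field_t,text_t
  &hl.requireFieldMatch=true
  &echoParams=all
```

echoParams=all, as Juan suggests, echoes the full effective parameter list (including defaults from solrconfig.xml) back in the response, which helps rule out a default overriding the request.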
Re: Highlighting in 3.5?
Hi Darren,

This is the expected behavior. Have you tried setting the hl.requireFieldMatch parameter to true? See: http://wiki.apache.org/solr/HighlightingParameters#hl.requireFieldMatch

*Juan*

On Mon, Jan 2, 2012 at 10:54 AM, Darren Govoni <dar...@ontrenet.com> wrote:

Hi,

Can someone tell me if this is correct behavior from Solr. I search on a dynamic field:

field_t:[* TO *]

I set the highlight fields to field_t,text_t, but I am not searching specifically inside the text_t field. The highlights for text_t come back with EVERY WORD. Maybe because of the [* TO *], but the query semantics indicate not searching on text_t even though highlighting is enabled. Is this correct behavior? It produces unwanted highlight results. I would expect Solr to know which fields are participating in the query and only highlight those that are involved in the result set.

Thanks,
Darren
Re: Grouping results after Sorting or vice-versa
Hi,

I don't have an answer, but maybe I can help you if you provide more information, for example:

- Which Solr version are you running?
- Which is the type of the date field?
- The output you are getting
- The output you expect
- Any other information that you consider relevant

Thanks,

*Juan*

On Wed, Dec 28, 2011 at 5:12 AM, vijayrs <ragavansvi...@gmail.com> wrote:

The issue I'm facing is that I don't get the expected results when I combine the group and sort params. The query is:

http://localhost:8080/solr/core1/select/?qt=nutch&q=*:*&fq=userid:333&group=true&group.field=threadid&group.sort=date%20desc&sort=date%20desc

where threadid is a hexadecimal string which is common to more than one message, and date is in unix timestamp format. The results should be sorted based on date and also grouped by threadid... how can it be done?

--
View this message in context: http://lucene.472066.n3.nabble.com/Grouping-results-after-Sorting-or-vice-versa-tp3615957p3615957.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Custom Solr FunctionQuery Error
Hi Parvin,

You must also add the query parser definition to solrconfig.xml, for example:

<queryParser name="graph" class="org.gasimzade.solr.GraphQParserPlugin"/>

*Juan*

On Wed, Dec 28, 2011 at 4:16 AM, Parvin Gasimzade <parvin.gasimz...@gmail.com> wrote:

Hi all,

I have created a custom Solr FunctionQuery in Solr 3.4. I extended the ValueSourceParser, ValueSource, Query, and QParserPlugin classes. I set the name parameter to "graph" inside the GraphQParserPlugin class. But when I try to search I get an error. The search queries are:

http://localhost:8080/solr/select/?q={!graph}test
http://localhost:8080/solr/select/?q=test&defType=graph

I also added <valueSourceParser name="graph" class="org.gasimzade.solr.GraphValueSourceParser"/> to solrconfig.xml, but I got the same error... The error message is:

Dec 27, 2011 7:05:20 PM org.apache.solr.common.SolrException log
SEVERE: org.apache.solr.common.SolrException: Unknown query type 'graph' at org.apache.solr.core.SolrCore.getQueryPlugin(SolrCore.java:1517) at org.apache.solr.search.QParser.getParser(QParser.java:316) at org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:80) at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:173) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1368) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:356) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:252) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210) at
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210) at org.springframework.web.filter.CharacterEncodingFilter.doFilterInternal(CharacterEncodingFilter.java:88) at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:76) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210) at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:224) at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:185) at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:472) at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:151) at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:100) at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:929) at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118) at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:405) at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:269) at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:515) at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:300) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) at java.lang.Thread.run(Thread.java:679) Thank you for your help. Best Regards, Parvin
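Putting Juan's answer together with the details in Parvin's message, solrconfig.xml would need both plugin registrations (the class names are taken from the thread; whether both plugins are in the same JAR is an assumption):

```xml
<!-- registers the "graph" function for use inside function queries -->
<valueSourceParser name="graph" class="org.gasimzade.solr.GraphValueSourceParser"/>

<!-- registers the "graph" query parser, so that {!graph}... and
     defType=graph no longer fail with "Unknown query type 'graph'" -->
<queryParser name="graph" class="org.gasimzade.solr.GraphQParserPlugin"/>
```

A valueSourceParser entry alone only covers function-query usage; resolving {!graph} or defType=graph requires the separate queryParser entry, which is why the original error persisted.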
Re: Trim and copy a solr field
Hi Swapna,

Do you want to modify the *indexed* value or the *stored* value? The analyzers modify the indexed value. To modify the stored value, the only option that I'm aware of is to write an UpdateProcessor that changes the document before it's indexed.

*Juan*

On Tue, Dec 13, 2011 at 2:05 AM, Swapna Vuppala <swapna.vupp...@arup.com> wrote:

Hi Juan,

Thanks for the reply. I tried using this, but I don't see any effect of the analyzer/filter. I tried copying my Solr field to another field of the type defined below. Then I indexed a couple of documents with the new schema, but I see that both fields have got the same value. I'm looking at the indexed data in Luke.

I'm assuming that analyzers process the field value (as specified by the various filters etc.) and then store the modified value. Is that true? What else could I be missing here?

Thanks and Regards,
Swapna.

-----Original Message-----
From: Juan Grande [mailto:juan.gra...@gmail.com]
Sent: Monday, December 12, 2011 11:50 PM
To: solr-user@lucene.apache.org
Subject: Re: Trim and copy a solr field

Hi Swapna,

You could try using a copyField to a field that uses PatternReplaceFilterFactory:

<fieldType class="solr.TextField" name="path_location">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.PatternReplaceFilterFactory" pattern="(.*)/.*" replacement="$1"/>
  </analyzer>
</fieldType>

The regular expression may not be exactly what you want, but it will give you an idea of how to do it. I'm pretty sure there must be some other ways of doing this, but this is the first that comes to my mind.

*Juan*

On Mon, Dec 12, 2011 at 4:46 AM, Swapna Vuppala <swapna.vupp...@arup.com> wrote:

Hi,

I have a Solr field that contains the absolute path of the file that is indexed, which will be something like file:/myserver/Folder1/SubFol1/Sub-Fol2/Test.msg. I'm interested in indexing the location in a separate field. I was looking for some way to trim the field value from the last occurrence of the char "/", so that I can get the location value, something like file:/myserver/Folder1/SubFol1/Sub-Fol2, and store it in a new field. Can you please suggest some way to achieve this?

Thanks and Regards,
Swapna.

Electronic mail messages entering and leaving Arup business systems are scanned for acceptability of content and viruses
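To see what the suggested pattern does, here is a small sketch in Python. Solr's PatternReplaceFilterFactory uses Java regular expressions, but for this particular pattern Python's `re` behaves the same: the greedy `(.*)` captures everything up to the *last* slash, and the trailing `/.*` discards the final path component.

```python
import re

# The pattern from Juan's fieldType: keep everything before the last '/',
# drop the final path component.
pattern = re.compile(r"(.*)/.*")

path = "file:/myserver/Folder1/SubFol1/Sub-Fol2/Test.msg"
location = pattern.sub(r"\1", path)
print(location)  # file:/myserver/Folder1/SubFol1/Sub-Fol2
```

As Juan notes in the follow-up, this analysis only changes the *indexed* terms; the stored value Swapna sees in Luke stays untouched, which is why both fields appeared identical.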
Re: limiting the content of content field in search results
Hi,

It sounds like highlighting might be the solution for you. See http://wiki.apache.org/solr/HighlightingParameters

*Juan*

On Mon, Dec 12, 2011 at 4:42 AM, ayyappan <ayyaba...@gmail.com> wrote:

I am developing an application which indexes whole PDFs and other documents to Solr. I have completed a working version of my application, but there are some problems. The main one is that when I do a search, the whole indexed document is shown. I have used solrj and need some help to reduce this content. How can I limit the content of the content field in search results? I need something like this:

*Grammer1.docx* Blazing – burring Faceted Cluster – to gather Geospatial Replication – coping Distinguish – apart from Flawlessly – perfectly Recipe – method Concentrated inscription Last Modified : 2011-12-11T14:42:27Z

*who.pdf* Who We Are Table of contents 1 Solr Committers (in alphabetical order)fgfgfgfg2 2 Inactive Committers (in alphabetical orde

*version_control.pdf* Solr Version Control System Table of contents 1 Overview.gfgfgfg 2 Web Acce
Re: Trim and copy a solr field
Hi Swapna,

You could try using a copyField to a field that uses PatternReplaceFilterFactory:

<fieldType class="solr.TextField" name="path_location">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.PatternReplaceFilterFactory" pattern="(.*)/.*" replacement="$1"/>
  </analyzer>
</fieldType>

The regular expression may not be exactly what you want, but it will give you an idea of how to do it. I'm pretty sure there must be some other ways of doing this, but this is the first that comes to my mind.

*Juan*

On Mon, Dec 12, 2011 at 4:46 AM, Swapna Vuppala <swapna.vupp...@arup.com> wrote:

Hi,

I have a Solr field that contains the absolute path of the file that is indexed, which will be something like file:/myserver/Folder1/SubFol1/Sub-Fol2/Test.msg. I'm interested in indexing the location in a separate field. I was looking for some way to trim the field value from the last occurrence of the char "/", so that I can get the location value, something like file:/myserver/Folder1/SubFol1/Sub-Fol2, and store it in a new field. Can you please suggest some way to achieve this?

Thanks and Regards,
Swapna.
Re: Using result grouping with SolrJ
Hi Kissue,

Support for grouping in SolrJ was added in Solr 3.4, see https://issues.apache.org/jira/browse/SOLR-2637. In previous versions you can access the grouping results by simply traversing the various named lists.

*Juan*

On Wed, Dec 7, 2011 at 1:22 PM, Kissue Kissue <kissue...@gmail.com> wrote:

Hi,

I am using Solr 3.3 with SolrJ. Does anybody know how I can use result grouping with SolrJ? Particularly, how can I retrieve the grouping results with SolrJ? Any help will be much appreciated.

Thanks.
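For pre-3.4 SolrJ, "traversing the various named lists" means walking the raw grouped response by name. A rough sketch of the response shape a grouped query produces (shown in XML form; the element names follow the standard grouping response, while the field name and values here are illustrative):

```xml
<lst name="grouped">
  <lst name="myfield">                      <!-- the group.field -->
    <int name="matches">42</int>            <!-- total matching docs -->
    <arr name="groups">
      <lst>
        <str name="groupValue">abc123</str> <!-- one value of group.field -->
        <result name="doclist" numFound="3" start="0">
          <doc>...</doc>                    <!-- top docs for this group -->
        </result>
      </lst>
    </arr>
  </lst>
</lst>
```

In SolrJ 3.3 the same structure arrives as nested NamedList/SolrDocumentList objects under the "grouped" key of the response, so the code drills down "grouped" → field name → "groups" and reads each group's "groupValue" and "doclist".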
Re: where is the SOLR_HOME ?
Hi Ahmad,

While Solr is starting, it writes the path to SOLR_HOME to the log. The message looks something like:

Sep 14, 2011 9:14:53 AM org.apache.solr.core.SolrResourceLoader init
INFO: Solr home set to 'solr/'

If you're running the example, SOLR_HOME is usually apache-solr-3.3.0/example/solr. Solr also writes a line like the following in the log for every JAR file it loads:

Sep 14, 2011 9:14:53 AM org.apache.solr.core.SolrResourceLoader replaceClassLoader
INFO: Adding 'file:/home/jgrande/apache-solr-3.3.0/contrib/extraction/lib/pdfbox-1.3.1.jar' to classloader

With this information you should be able to determine which JAR files Solr is loading, and I'm pretty sure that it's loading all the files you need. The problem may be that you must also include apache-solr-analysis-extras-3.3.0.jar from the apache-solr-3.3.0/dist directory.

Regards,

*Juan*

On Wed, Sep 14, 2011 at 12:19 AM, ahmad ajiloo <ahmad.aji...@gmail.com> wrote:

Hi,

This page (http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.ICUTokenizerFactory) says: "Note: to use this filter, see solr/contrib/analysis-extras/README.txt for instructions on which jars you need to add to your SOLR_HOME/lib". I can't find SOLR_HOME/lib!

1- Is it apache-solr-3.3.0\example\solr? There is no directory named lib. I created the example/solr/lib directory, copied the jar files to it, and tested these expressions in solrconfig.xml:

<lib dir="../../example/solr/lib" />
<lib dir="./lib" />
<lib dir="../../../example/solr/lib" /> (for more assurance!!!)

but it doesn't work and still has the following errors!

2- Or apache-solr-3.3.0\? There is no directory named lib.

3- Or apache-solr-3.3.0\example? There is a lib directory.
I copied the 4 libraries from solr/contrib/analysis-extras/ to apache-solr-3.3.0\example\lib, but some errors occur when loading the page "http://localhost:8983/solr/admin".

I use Nutch to crawl the web and fetch web pages. I send the data from Nutch to Solr for indexing. According to the Nutch tutorial (http://wiki.apache.org/nutch/NutchTutorial#A6._Integrate_Solr_with_Nutch) I should copy the schema.xml of Nutch to the conf directory of Solr. So I added all of my required analyzers, like *ICUNormalizer2FilterFactory*, to this new schema.xml.

This is schema.xml (I added the bold text to this file):

<?xml version="1.0" encoding="UTF-8" ?>
<schema name="nutch" version="1.3">
<types>
<fieldType name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true"/>
<fieldType name="long" class="solr.TrieLongField" precisionStep="0" omitNorms="true" positionIncrementGap="0"/>
<fieldType name="float" class="solr.TrieFloatField" precisionStep="0" omitNorms="true" positionIncrementGap="0"/>
<fieldType name="date" class="solr.TrieDateField" precisionStep="0" omitNorms="true" positionIncrementGap="0"/>
<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
<analyzer>
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>
</fieldType>
*<fieldType name="text_icu" class="solr.TextField" autoGeneratePhraseQueries="false">
<analyzer>
<tokenizer class="solr.ICUTokenizerFactory"/>
</analyzer>
</fieldType>
<fieldType name="icu_sort_en" class="solr.TextField">
<analyzer>
<tokenizer class="solr.KeywordTokenizerFactory"/>
<filter class="solr.ICUCollationKeyFilterFactory" locale="en" strength="primary"/>
</analyzer>
</fieldType>
<fieldType name="normalized" class="solr.TextField">
<analyzer>
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.ICUNormalizer2FilterFactory" name="nfkc_cf" mode="compose"/>
</analyzer>
</fieldType>
<fieldType name="folded" class="solr.TextField">
<analyzer>
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.ICUFoldingFilterFactory"/>
</analyzer>
</fieldType>
<fieldType name="transformed" class="solr.TextField">
<analyzer>
Re: geodist() parameters?
Hi Bill,

As far as I know, you can pass a completely different set of parameters to each of the functions/filters. For example:

http://localhost:8983/solr/select?q={!func}add(geodist(field1, 10, -10),geodist(field2, 20, -20))&fq={!geofilt sfield=field3 pt=30,-30 d=100}

Let me know if this solved your problem!

*Juan*

On Wed, Aug 31, 2011 at 11:58 PM, William Bell <billnb...@gmail.com> wrote:

I want to do a geodist() calculation on 2 different sfields. How would I do that?

http://localhost:8983/solr/select?q={!func}add(geodist(),geodist())&fq={!geofilt}&pt=39.86347,-105.04888&d=100&sfield=store_lat_lon

But I really want geodist() for one pt, and another geodist() for another pt. Can I do something like geodist(store_lat_lon,39.86347,-105.04888,100)?

--
Bill Bell
billnb...@gmail.com
cell 720-256-8076
Re: How to list all dynamic fields of a document using solrj?
Hi Michael,

It's supposed to work. Can we see a snippet of the code you're using to retrieve the fields?

*Juan*

On Mon, Aug 29, 2011 at 8:33 AM, Michael Szalay <michael.sza...@basis06.ch> wrote:

Hi all,

How can I list all dynamic fields and their values of a document using solrj? The dynamic fields are never returned when I use setFields("*").

Thanks,
Michael

--
Michael Szalay
Senior Software Engineer basis06 AG, Birkenweg 61, CH-3013 Bern - Fon +41 31 311 32 22
http://www.basis06.ch - source of smart business
Re: Exact Match using Copy Fields
Hi,

Are you sure you're using the dismax query parser? Make sure you have the parameter defType=dismax in your request.

*Juan*

On Thu, Aug 18, 2011 at 11:22 AM, jyn7 <jyotsna.namb...@gmail.com> wrote:

Hi,

I am trying to achieve an exact match search on a text field. I am using a copy field, copying it to a string, and using that for the search:

<field name="imprint" type="text" indexed="true" stored="true"/>
<field name="author" type="text" indexed="true" stored="true"/>
<field name="author_exact" type="string" indexed="true" stored="false"/>
<field name="imprint_exact" type="string" indexed="true" stored="false"/>

<copyField source="author" dest="author_exact"/>
<copyField source="imprint" dest="imprint_exact"/>

Now I want to do an exact match on the imprint field and am trying to search using the below, but the results are not limited to imprint_exact; I even get results with author_exact having the queried value:

facet=true&qf=imprint_exact&fl=*,score&fq=published_on:[* TO NOW]&q=Cris Williamson&start=0&rows=10

Can anyone help me correct this? Thanks.
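A sketch of the request with Juan's point applied (the parameters are copied from the original query; only defType is added). Without defType=dismax, the qf parameter is silently ignored by the default lucene parser, so the query falls back to the default search field and matches author_exact too:

```text
select?defType=dismax
      &qf=imprint_exact
      &q=Cris Williamson
      &fq=published_on:[* TO NOW]
      &fl=*,score
      &facet=true&start=0&rows=10
```

With defType=dismax in place, qf=imprint_exact restricts matching to that one string field, which gives the exact-match behavior being asked for.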
Re: Possible to use quotes in dismax qf?
Hi,

You can use the pf parameter of the DismaxQParserPlugin: http://wiki.apache.org/solr/DisMaxQParserPlugin#pf_.28Phrase_Fields.29

This parameter receives a list of fields using the same syntax as the qf parameter. After determining the list of matching documents, DismaxQParserPlugin will boost the docs where the terms of the query match as a phrase in one of those fields. You can also use the ps parameter to set a phrase slop and boost docs where the terms appear in close proximity instead of as an exact phrase.

Regards,

*Juan*

On Thu, Jul 28, 2011 at 11:00 AM, O. Klein <kl...@octoweb.nl> wrote:

I want to do a dismax search to search for the original query and this query as a phrase query:

q=sail boat

needs to be converted to the dismax query

q=sail boat "sail boat"
qf=title^10 content^2

What is the best way to do this?
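A sketch of the parameters for O. Klein's example, following Juan's suggestion (the boost values are copied from the question; reusing the same boosts in pf is an assumption, and ps=0 is illustrative):

```text
q=sail boat
defType=dismax
qf=title^10 content^2
pf=title^10 content^2
ps=0
```

With pf set, dismax automatically adds the implicit phrase query "sail boat" against title and content as a boost, so there is no need to rewrite q by hand; raising ps from 0 would also reward near-phrase matches.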
Re: deletedPkQuery fails
Hi Elaine,

I think you have a syntax error in your query. I'd recommend that you first try the query using a SQL client, until you get it right. This part seems strange to me: and pl.deleted='' having count(*)=0

*Juan*

On Wed, Jul 13, 2011 at 5:09 PM, Elaine Li <elaine.bing...@gmail.com> wrote:

Hi Folks,

I am trying to use the deletedPkQuery to enable deltaImport to remove the inactive products from Solr. I keep getting a syntax error saying the query syntax is not right. I have tried many alternatives to the following query. Although all of them work in the mysql prompt directly, none works in the Solr handler. Can anyone give me some hint to debug this type of problem? Is there anything special about deletedPkQuery I am not aware of?

deletedPkQuery=select p.pId as id from products p join products_large pl on p.pId=pl.pId where p.pId= ${dataimporter.delta.id} and pl.deleted='' having count(*)=0

Jul 13, 2011 4:02:23 PM org.apache.solr.handler.dataimport.DataImporter doDeltaImport
SEVERE: Delta Import Failed org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to execute query: select p.pId as id from products p join products_large pl on p.pId=pl.pId where p.pId= and pl.deleted='' having count(*)=0 Processing Document # 1 at org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:72) at org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.init(JdbcDataSource.java:253) at org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:210) at org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:39) at org.apache.solr.handler.dataimport.SqlEntityProcessor.initQuery(SqlEntityProcessor.java:58) at org.apache.solr.handler.dataimport.SqlEntityProcessor.nextDeletedRowKey(SqlEntityProcessor.java:91) at org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextDeletedRowKey(EntityProcessorWrapper.java:258) at
org.apache.solr.handler.dataimport.DocBuilder.collectDelta(DocBuilder.java:636) at org.apache.solr.handler.dataimport.DocBuilder.doDelta(DocBuilder.java:258) at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:172) at org.apache.solr.handler.dataimport.DataImporter.doDeltaImport(DataImporter.java:352) at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:391) at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:370) Caused by: com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: You have an error in your SQL syntax; check the manual that corresponds to your MySQL serv er version for the right syntax to use near 'and pl.deleted='' having count(*)=0' at line 1 at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source) at java.lang.reflect.Constructor.newInstance(Unknown Source) at com.mysql.jdbc.Util.handleNewInstance(Util.java:407) at com.mysql.jdbc.Util.getInstance(Util.java:382) at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1052) at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3603) at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3535) at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1989) at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2150) at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2620) at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2570) at com.mysql.jdbc.StatementImpl.execute(StatementImpl.java:779) at com.mysql.jdbc.StatementImpl.execute(StatementImpl.java:622) at org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.init(JdbcDataSource.java:246) Elaine
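Worth noting: the logged query reads "where p.pId= and pl.deleted=''", i.e. the ${dataimporter.delta.id} placeholder expanded to nothing, which is exactly where MySQL reports the syntax error. A deletedPkQuery is run on its own during delta-import to enumerate deleted keys, so a self-contained query is the usual shape. The sketch below is only a guess at the intended semantics ("products with no remaining non-deleted row"); the table and column meanings are assumed from Elaine's query:

```sql
-- Hypothetical reformulation: return ids of products for which no
-- products_large row with deleted='' exists. The LEFT JOIN plus
-- GROUP BY/HAVING replaces the bare HAVING of the original, which
-- had no GROUP BY to aggregate over.
SELECT p.pId AS id
FROM products p
LEFT JOIN products_large pl
       ON p.pId = pl.pId AND pl.deleted = ''
GROUP BY p.pId
HAVING COUNT(pl.pId) = 0
```

As Juan suggests, validating a candidate query in a SQL client first separates SQL problems from DataImportHandler variable-substitution problems.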
Re: can't get moreLikeThis to work
Hi Elaine,

You can add the indent=true parameter to the request to get a tidier output. Firefox usually ignores tabs when showing XML, so I'd suggest choosing "View page source" in that case.

"The documentation seems to suggest to have stored=true for the fields though, not sure why." Maybe someone else can give more details here, but as I understand it, if you store the term vectors, then MLT will use that information to compute similarity. If you don't store the term vectors, but you store the content, then MLT will iterate over the content of the field (more precisely, the first mlt.maxntp tokens) and generate the term vectors. In that case, the document is analyzed again, so it's a slower approach. If you don't store any of them, then you get no results :)

*Juan*

On Fri, Jul 8, 2011 at 11:08 AM, Elaine Li <elaine.bing...@gmail.com> wrote:

Guan and Koji, thank you both! After I changed to termVectors=true, it returns the results as expected.

I flipped stored=true|false for two fields, text and category_text, compared the results, and don't see any difference. The documentation seems to suggest to have stored=true for the fields though, not sure why.

The debugQuery=on triggered debug message is a little difficult to read visually, with everything in one big long string. After I turn on wt=json, it becomes slightly better, with some spaces. Is there a hierarchical display of the debug message, separated into multiple lines with indents, so it is easier to digest?

Elaine

On Thu, Jul 7, 2011 at 8:48 PM, Koji Sekiguchi <k...@r.email.ne.jp> wrote:

Plus, debugQuery=on would help you when using MLT after 3.1: https://issues.apache.org/jira/browse/SOLR-860

koji
--
http://www.rondhuit.com/en/

(11/07/08 6:55), Juan Grande wrote:

Hi Elaine,

The first thing that comes to my mind is that neither the content nor the term vectors of the text and category_text fields are being stored.
Check the name of the parameter used to store the term vectors, which actually is termVectors and not term_vectored (see http://wiki.apache.org/solr/SchemaXml#Expert_field_options). Try changing that and tell us if it worked!

Regards,

*Juan*

On Thu, Jul 7, 2011 at 4:44 PM, Elaine Li <elaine.bing...@gmail.com> wrote:

Hi Folks,

This is my configuration for mlt in solrconfig.xml:

<requestHandler name="/mlt" class="org.apache.solr.handler.MoreLikeThisHandler">
  <lst name="defaults">
    <str name="mlt.fl">name,text,category_text</str>
    <int name="mlt.mintf">2</int>
    <int name="mlt.mindf">1</int>
    <int name="mlt.minwl">3</int>
    <int name="mlt.maxwl">1000</int>
    <int name="mlt.maxqt">50</int>
    <int name="mlt.maxntp">5000</int>
    <bool name="mlt.boost">true</bool>
    <str name="mlt.qf">name,text,category_text</str>
    <str name="mlt.interestingTerms"/>
  </lst>
</requestHandler>

I also defined the three fields to have the term_vectored attribute in schema.xml:

<field name="name" type="text_nostem" indexed="true" stored="true" term_vectored="true"/>
<field name="text" type="text_nostem" indexed="true" stored="false" multiValued="true" term_vectored="true"/>
<field name="category_text" type="text_strip_id" indexed="true" stored="false" multiValued="true" term_vectored="true"/>

When I submit the query http://localhost:8983/solr/mlt?q=id:69134&mlt.count=10, the return only contains one document with id=69134. Does anyone know or can guess what I missed? Thanks.

Elaine
Re: bug in ExtractingRequestHandler with PDFs and metadata field Category
Hi Andras,

"I added <str name="uprefix">metadata_</str> so all PDF metadata fields should be saved in Solr as metadata_something fields. The problem is that the Category metadata field from the PDF for some reason is not prefixed with metadata_ and Solr will merge the Category field I have in the schema with the Category metadata from the PDF."

This is the expected behavior, as described in http://wiki.apache.org/solr/ExtractingRequestHandler: "uprefix=prefix - Prefix all fields that are not defined in the schema with the given prefix." You can use the fmap parameter to redirect the Category metadata to another field.

Regards,

*Juan*

On Thu, Jul 7, 2011 at 10:44 AM, Andras Balogh <and...@reea.net> wrote:

Hi,

I think this is a bug, but before reporting it to the issue tracker I thought I would ask here first. The problem is that I have a PDF file which, among other metadata fields like Author, CreatedDate etc., has a metadata field Category (I can see all metadata fields with tika-app.jar started in GUI mode). Now, in my SOLR schema I also have a Category field, among other fields, and a field called "text" that holds the extracted text from the PDF. I added <str name="uprefix">metadata_</str> so all PDF metadata fields should be saved in Solr as metadata_something fields. The problem is that the Category metadata field from the PDF for some reason is not prefixed with metadata_, and Solr will merge the Category field I have in the schema with the Category metadata from the PDF, and I will get an error like:

multiple values encountered for non multiValued field Category

I fixed this by patching tika-parsers.jar so it ignores the Category metadata in org.apache.tika.parser.pdf.PDFParser, but this is not a good solution (I don't need that Category metadata, so it works for me). So let me know if this should be reported as a bug or not.

Regards,
Andras.
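Following Juan's pointer, the fmap parameter maps a Tika-extracted field to a schema field of your choosing on the extract request. A sketch (the handler path and the target field name metadata_category are illustrative; fmap.Category matches the clashing metadata field from the thread):

```text
/update/extract?uprefix=metadata_&fmap.Category=metadata_category
```

Because Category collides with a schema field, uprefix never applies to it (uprefix only prefixes fields *not* defined in the schema); the explicit fmap.Category mapping diverts it before the merge that caused the "multiple values encountered for non multiValued field Category" error.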
Re: updating documents while keeping unspecified fields
Hi Adeel, As far as I know, this isn't possible yet, but some work is being done: https://issues.apache.org/jira/browse/SOLR-139 https://issues.apache.org/jira/browse/SOLR-828 Regards, *Juan* On Thu, Jul 7, 2011 at 2:24 PM, Adeel Qureshi adeelmahm...@gmail.com wrote: What I am trying to do is to update a document's information while keeping the data for the fields that aren't specified in the update. So e.g. if this is the document:

<doc>
  <field name="id">123</field>
  <field name="title">some title</field>
  <field name="status">active</field>
</doc>

and I send:

<doc>
  <field name="id">123</field>
  <field name="status">closed</field>
</doc>

it should update the status to be closed for this document, but not wipe out the title, since it wasn't provided in the updated data. Is that possible by using some flags or something? Thanks, Adeel
Re: can't get moreLikeThis to work
Hi Elaine, The first thing that comes to my mind is that neither the content nor the term vectors of the text and category_text fields are being stored. Check the name of the parameter used to store the term vectors, which is actually termVectors and not term_vectored (see http://wiki.apache.org/solr/SchemaXml#Expert_field_options). Try changing that and tell us if it worked! Regards, *Juan* On Thu, Jul 7, 2011 at 4:44 PM, Elaine Li elaine.bing...@gmail.com wrote: Hi Folks, This is my configuration for mlt in solrconfig.xml:

<requestHandler name="/mlt" class="org.apache.solr.handler.MoreLikeThisHandler">
  <lst name="defaults">
    <str name="mlt.fl">name,text,category_text</str>
    <int name="mlt.mintf">2</int>
    <int name="mlt.mindf">1</int>
    <int name="mlt.minwl">3</int>
    <int name="mlt.maxwl">1000</int>
    <int name="mlt.maxqt">50</int>
    <int name="mlt.maxntp">5000</int>
    <bool name="mlt.boost">true</bool>
    <str name="mlt.qf">name,text,category_text</str>
    <str name="mlt.interestingTerms"/>
  </lst>
</requestHandler>

I also defined the three fields to have the term_vectored attribute in schema.xml:

<field name="name" type="text_nostem" indexed="true" stored="true" term_vectored="true"/>
<field name="text" type="text_nostem" indexed="true" stored="false" multiValued="true" term_vectored="true"/>
<field name="category_text" type="text_strip_id" indexed="true" stored="false" multiValued="true" term_vectored="true"/>

When I submit the query http://localhost:8983/solr/mlt?q=id:69134&mlt.count=10, the return only contains one document, with id=69134. Does anyone know or can guess what I missed? Thanks. Elaine
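The fix Juan suggests amounts to renaming the attribute in schema.xml; a sketch of the corrected field definitions:

```xml
<field name="name" type="text_nostem" indexed="true" stored="true" termVectors="true"/>
<field name="text" type="text_nostem" indexed="true" stored="false" multiValued="true" termVectors="true"/>
<field name="category_text" type="text_strip_id" indexed="true" stored="false" multiValued="true" termVectors="true"/>
```

Note that MoreLikeThis can only fall back on re-analyzing stored content when term vectors are absent, so fields with stored="false" need termVectors="true" for MLT to use them at all.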
Re: Query does not work when changing param order
Hi Juan! I think your problem is that in the second case the FieldQParserPlugin is building a phrase query for mytag myothertag. I recommend splitting the filter into two different filters, one for each tag. If each tag is used in many different filters, and the combination of tags is rarely repeated, this will also result in a more efficient use of the filterCache. Regards, *Juan* On Thu, Jul 7, 2011 at 12:07 PM, Juan Manuel Alvarez naici...@gmail.com wrote: Hi everyone! I would like to ask you a question about a problem I am facing with a Solr query. I have a field tags of type textgen and some documents with the values myothertag,mytag. When I use the query: /solr/select?sort=name_sort+asc&start=0&qf=tags&q.alt=*:*&fq={!field q.op=AND f=tags}myothertag mytag&rows=60&defType=dismax everything works as expected, but if I change the order of the parameters in the fq, like this: /solr/select?sort=name_sort+asc&start=0&qf=tags&q.alt=*:*&fq={!field q.op=AND f=tags}mytag myothertag&rows=60&defType=dismax I get no results. As far as I have seen, the textgen field should tokenize the words in the field, so if I use comma-separated values, like in my example, both words are going to be indexed. Can anyone please point me in the right direction? Cheers! Juan M.
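Juan's suggestion of one filter per tag would look something like this (with the other request parameters unchanged):

```
fq={!field f=tags}mytag&fq={!field f=tags}myothertag
```

Each fq is cached independently in the filterCache, so a later query for a different combination of the same tags can reuse both entries.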
Re: The OR operator in a query ?
Hi, These two are valid and equivalent:
- fq=sometag:1 OR sometag:5
- fq=sometag:(1 OR 5)
Also, beware that fq defines a filter query, which is different from a regular query (http://wiki.apache.org/solr/CommonQueryParameters#fq). For more details on the query syntax see http://lucene.apache.org/java/2_4_0/queryparsersyntax.html Regards, *Juan* On Tue, Jul 5, 2011 at 3:15 PM, duddy67 san...@littlemarc.com wrote: Hi all, Could someone tell me what the OR syntax is in SOLR and how to use it in a search query? I tried: fq=sometag:1+sometag:5 fq=sometag:[1+5] fq=sometag:[1OR5] fq=sometag:1+5 and many more, but it was impossible to get what I want. Thanks in advance -- View this message in context: http://lucene.472066.n3.nabble.com/The-OR-operator-in-a-query-tp3141843p3141843.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Spellchecker in zero-hit search result
Hi Marian, I guess that your problem isn't related to the number of results, but to the component's configuration. The configuration that you show is meant to set up an autocomplete component that will suggest terms from an incomplete user input (something similar to what Google does while you're typing in the search box), see http://wiki.apache.org/solr/Suggester. That's why your suggestions for place are places and placed, all sharing the place prefix. But when you search for placw, the component doesn't return any suggestion, because no term in your index begins with placw. You can learn how to correctly configure a spellchecker here: http://wiki.apache.org/solr/SpellCheckComponent. I'd also recommend taking a look at the example's solrconfig, because it provides an example spellchecker configuration. Regards, *Juan* On Mon, Jul 4, 2011 at 7:30 AM, Marian Steinbach marian.steinb...@gmail.com wrote: Hi! I want my spellchecker component to return search query suggestions, regardless of the number of items in the search results. (Actually I'd find it most useful in zero-hit cases...) Currently I only get suggestions if the search returns one or more hits.
Example: q=place

<response>
  <result name="response" numFound="20" start="0" maxScore="2.2373123"/>
  <lst name="spellcheck">
    <lst name="suggestions">
      <lst name="place">
        <int name="numFound">4</int>
        <int name="startOffset">0</int>
        <int name="endOffset">5</int>
        <arr name="suggestion">
          <str>place</str>
          <str>places</str>
          <str>placed</str>
        </arr>
      </lst>
      <str name="collation">place</str>
    </lst>
  </lst>
</response>

Example: q=placw

<response>
  <result name="response" numFound="0" start="0" maxScore="0.0"/>
  <lst name="spellcheck">
    <lst name="suggestions"/>
  </lst>
</response>

This is my spellchecker configuration (where I already fiddled around more than probably useful):

<searchComponent class="solr.SpellCheckComponent" name="suggest">
  <lst name="spellchecker">
    <str name="name">suggest</str>
    <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
    <str name="lookupImpl">org.apache.solr.spelling.suggest.tst.TSTLookup</str>
    <str name="field">autocomplete</str>
    <float name="threshold">0.005</float>
    <str name="accuracy">0.1</str>
    <str name="buildOnCommit">true</str>
    <float name="thresholdTokenFrequency">.001</float>
  </lst>
</searchComponent>

<requestHandler class="org.apache.solr.handler.component.SearchHandler" name="/suggest">
  <lst name="defaults">
    <str name="wt">json</str>
    <str name="spellcheck">true</str>
    <str name="spellcheck.dictionary">suggest</str>
    <str name="spellcheck.onlyMorePopular">false</str>
    <str name="spellcheck.count">4</str>
    <str name="spellcheck.collate">true</str>
  </lst>
  <arr name="components">
    <str>suggest</str>
  </arr>
</requestHandler>

Did I misunderstand anything? Thanks!
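For contrast with the prefix-based Suggester configuration above, a conventional index-based spellchecker looks roughly like this (a sketch; the field name spell and dictionary name default are illustrative):

```xml
<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <lst name="spellchecker">
    <str name="name">default</str>
    <!-- "spell" is an assumed field populated via copyField -->
    <str name="field">spell</str>
    <str name="spellcheckIndexDir">./spellchecker</str>
    <str name="buildOnCommit">true</str>
  </lst>
</searchComponent>
```

Unlike a prefix lookup, this builds an edit-distance dictionary that can propose corrections such as place for the misspelling placw.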
Re: Feed index with analyzer output
Hi Lox, But I would also like to retain the original non-analyzed field for displaying purposes. Actually, for stored fields, Solr always retains the original non-analyzed content, which is the one included in the response. So, if I'm not missing something, you don't need to separate the analysis (Solr does this for you!), just configure the analysis that you want for the indexed fields, and the stored content will be saved verbatim. Regards, *Juan* On Sat, Jul 2, 2011 at 7:17 AM, Lox lorenzo.fur...@gmail.com wrote: Hi, I'm trying to achieve a better separation between the analysis of a document (tokenizing, filtering, etc.) and the indexing (storing). Now, I would like my application to call the analyzer (/analysis/document) via REST, which returns the various tokens in XML format, then feed these data to the index directly without doing the analysis again. But I would also like to retain the original non-analyzed field for displaying purposes. This can probably be achieved with a copyField, right? So my question is: is it possible to feed the solr index with the output of the analyzer? Thank you. -- View this message in context: http://lucene.472066.n3.nabble.com/Feed-index-with-analyzer-output-tp3131771p3131771.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Reading data from Solr MoreLikeThis
Hi, As far as I know, there's no specific method to get the MoreLikeThis section from the response. Anyway, you can retrieve the results with a piece of code like the following:

// the <lst name="moreLikeThis"> is a NamedList of SolrDocumentLists
NamedList<SolrDocumentList> mltResult =
    (NamedList<SolrDocumentList>) response.getResponse().get("moreLikeThis");
for (Map.Entry<String, SolrDocumentList> entry : mltResult) {
  System.out.println("Docs similar to " + entry.getKey());
  for (SolrDocument similarDoc : entry.getValue()) {
    System.out.println(" - " + similarDoc.get("id"));
  }
}

Hope that helps! *Juan* On Fri, Jul 1, 2011 at 3:04 PM, Sheetal rituzprad...@gmail.com wrote: Hi, I am a beginner in Solr. I am trying to read data from Solr MoreLikeThis through Java. My query is http://localhost:8983/solr/select?q=repository_id:20&mlt=true&mlt.fl=filename&mlt.mindf=1&mlt.mintf=1&debugQuery=on&mlt.interestingTerms=detail I want to read the data of the <lst name="moreLikeThis"> field from the output. The main idea is that, after I do moreLikeThis, all field values of moreLikeThis should print out in my program. I figured out the way to read the Result tag by doing QueryResponse rsp.getResults() and looping over it. But how would I read and print the values of the moreLikeThis tag? Is there any class method like rsp.getMoreLikeThisField(fieldname) or something? Thank you in advance. :) -- View this message in context: http://lucene.472066.n3.nabble.com/Reading-data-from-Solr-MoreLikeThis-tp3130184p3130184.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Regex replacement not working!
Hi Samuele, It's not clear to me if your goal is to search on that field (for example, salary_min:[100 TO 200]) or if you want to show the transformed field to the user (so you want the result of the regex replacement to be included in the search results). If your goal is to show the results to the user, then (as Ahmet said in a previous mail) it won't work, because the content of the documents is stored verbatim. The analysis only affects the way that documents are searched. If your goal is to search, could you please show us the query that you're using to test the use case? Thanks! *Juan* On Wed, Jun 29, 2011 at 10:02 AM, samuele.mattiuzzo samum...@gmail.com wrote: Ok, but I'm not applying the filtering on the copyfields. This is how my schema looks:

<field name="salary" type="text" indexed="true" stored="true"/>
<field name="salary_min" type="salary_min_text" indexed="true" stored="true"/>
<field name="salary_max" type="salary_max_text" indexed="true" stored="true"/>
<copyField source="salary" dest="salary_min"/>
<copyField source="salary" dest="salary_max"/>

and the two datatypes defined before. That's why I thought I could first use copyField to copy the value, then index it with my two datatypes' filtering... -- View this message in context: http://lucene.472066.n3.nabble.com/Regex-replacement-not-working-tp3120748p3121497.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: tika and solr 3,1 integration
Hi Naveen, Check if there is a dynamic field named attr_* in the schema. The uprefix=attr_ parameter means that if Solr can't find an extracted field in the schema, it'll add the prefix attr_ and try again. *Juan* On Thu, Jun 2, 2011 at 4:21 AM, Naveen Gupta nkgiit...@gmail.com wrote: Hi, I am trying to integrate Solr 3.1 and Tika (which comes by default with the version) and, using a curl command to index a few documents, I am getting this error: the attr_meta field is unknown. I checked the solrconfig, it looks perfect to me. Can you please tell me what I am missing? I copied all the jars from contrib/extraction/lib to the solr/lib folder that is in the same place as conf. I am using the request handler which comes by default:

<requestHandler name="/update/extract" startup="lazy" class="solr.extraction.ExtractingRequestHandler">
  <lst name="defaults">
    <!-- All the main content goes into "text"... if you need to return the
         extracted text or do highlighting, use a stored field. -->
    <str name="fmap.content">text</str>
    <str name="lowernames">true</str>
    <str name="uprefix">ignored_</str>
    <!-- capture link hrefs but ignore div attributes -->
    <str name="captureAttr">true</str>
    <str name="fmap.a">links</str>
    <str name="fmap.div">ignored_</str>
  </lst>
</requestHandler>

curl "http://dev.grexit.com:8080/solr1/update/extract?literal.id=who.pdf&uprefix=attr_&attr_fmap.content=attr_content&commit=true" -F myfile=@/root/apache-solr-3.1.0/docs/who.pdf

The response is a Tomcat error page: HTTP Status 400 - ERROR: unknown field 'attr_meta' - The request sent by the client was syntactically incorrect (ERROR: unknown field 'attr_meta'). (Apache Tomcat/6.0.18) Please note I integrated Apache Tika 0.9 with apache-solr-1.4 locally on a Windows machine, and using Solr Cell, calling the program works fine without any changes in configuration. Thanks, Naveen
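Juan's check can be satisfied by the catch-all dynamic field that ships in the example schema; a sketch (the textgen type name is the one used in the 1.4/3.1 example schema, adjust to your own types):

```xml
<dynamicField name="attr_*" type="textgen" indexed="true" stored="true" multiValued="true"/>
```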
Re: Sorting
Hi Clécio, Your problem may be caused by the case sensitivity of string fields. Try using the lowercase field type that comes in the example. Regards, *Juan* On Thu, Jun 2, 2011 at 6:13 PM, Clecio Varjao cleciovar...@gmail.com wrote: Hi, When using the following URL: http://localhost:8080/solr/StatReg/select?version=2.2&sort=path+asc&fl=path&start=0&q=paths%3A%222%2Froot%2FStatReg%2F--+C+--%22&hl=off&rows=500 I get the result in the following order: [...] /-- C --/Community Care Facility Act [RSBC 1996] c. 60/00_96060REP_01.xml /-- C --/Community Care and Assisted Living Act [SBC 2002] c. 75/00_02075_01.xml [...] However, the order is not right: Assisted should come before Facility Act. I'm using the following schema configuration:

<fieldtype name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true"/>
<field name="path" type="string" indexed="true" stored="true" multiValued="false"/>

Thanks, Clécio
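The lowercase sort type Juan refers to is defined in the example schema roughly like this (reproduced as a sketch; the stock alphaOnlySort version also strips non-alphabetic characters with a PatternReplaceFilterFactory):

```xml
<fieldType name="alphaOnlySort" class="solr.TextField" sortMissingLast="true" omitNorms="true">
  <analyzer>
    <!-- KeywordTokenizer keeps the whole value as one token, as sorting requires -->
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.TrimFilterFactory"/>
  </analyzer>
</fieldType>
```

Copy the path field into a field of this type (e.g. via copyField) and sort on that field instead of the string one.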
Re: Lazy loading error in Extracting request handler
Hi Vignesh, Are you working from the provided example? If not, did you copy the solr-cell libraries to your Solr deployment? You can follow the instructions here: http://wiki.apache.org/solr/ExtractingRequestHandler#Configuration Regards, *Juan* On Tue, Apr 19, 2011 at 3:47 AM, Vignesh Raj vignesh...@greatminds.co.inwrote: Hi, I am new to Solr and its configuration. I need to index some pdf files and for that reason I thought of using the extractingrequesthandler. I use Apache tomcat and run solr from it. I used the following command to index a pdf file. C:\Users\vikky\Downloads\curl-7.21.0-win64-nosslcurl.exe http://localhost:8983/solr/update/extract?literal.id=doc4uprefix=attr_fma p.content=attr_contentcommit=true -F myfile=@G:\Official\Archiving\apache-solr-1.4.1\apache-solr-1.4.1\docs\featu res.pdf But, I got the following error. htmlheadtitleApache Tomcat/6.0.32 - Error report/titlestyle!--H1 {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;fo nt-size:22px;} H2 {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;fo nt-size:16px;} H3 {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;fo nt-size:14px;} BODY {font-family:Tahoma,Arial,sans-serif;color:black;background-color:white;} B {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;} P {font-family:Tahoma,Arial,sans-serif;background:white;color:black;font-size: 12px;}A {color : black;}A.name {color : black;}HR {color : #525D76;}--/style /headbodyh1HTTP Status 500 - lazy loading error org.apache.solr.common.SolrException: lazy loading error at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.getWrappedHan dler(RequestHandlers.java:249) at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest (RequestHandlers.java:231) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:3 38) at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java: 241) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(Application FilterChain.java:235) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterCh ain.java:206) at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.ja va:233) at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.ja va:191) at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127 ) at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102 ) at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java :109) at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298) at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:859) at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http 11Protocol.java:588) at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489) at java.lang.Thread.run(Unknown Source) Caused by: org.apache.solr.common.SolrException: Error loading class 'org.apache.solr.handler.extraction.ExtractingRequestHandler' at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:37 5) at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:413) at org.apache.solr.core.SolrCore.createRequestHandler(SolrCore.java:449) at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.getWrappedHan dler(RequestHandlers.java:240) ... 
16 more Caused by: java.lang.ClassNotFoundException: org.apache.solr.handler.extraction.ExtractingRequestHandler at java.net.URLClassLoader$1.run(Unknown Source) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(Unknown Source) at java.lang.ClassLoader.loadClass(Unknown Source) at java.net.FactoryURLClassLoader.loadClass(Unknown Source) at java.lang.ClassLoader.loadClass(Unknown Source) at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Unknown Source) at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:35 9) ... 19 more /h1HR size=1 noshade=noshadepbtype/b Status report/ppbmessage/b ulazy loading error org.apache.solr.common.SolrException: lazy loading error at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.getWrappedHan dler(RequestHandlers.java:249) at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest (RequestHandlers.java:231) at
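As an alternative to copying the jars next to conf, more recent example solrconfig.xml files reference the extraction contrib with <lib> directives; a sketch (paths are relative to the instance dir and assume the stock distribution layout, so adjust for your install):

```xml
<lib dir="../../contrib/extraction/lib" />
<lib dir="../../dist/" />
```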
Re: QUESTION: SOLR INDEX BIG FILE SIZES
I'm sorry, you're right, I was thinking of the 2GB default value for maxMergeMB. *Juan* On Mon, Apr 18, 2011 at 3:16 PM, Burton-West, Tom tburt...@umich.edu wrote: As far as I know, Solr will never arrive to a segment file greater than 2GB, so this shouldn't be a problem. Solr can easily create a file size over 2GB, it just depends on how much data you index and your particular Solr configuration, including your ramBufferSizeMB, your mergeFactor, and whether you optimize. For example, we index about a terabyte of full text and optimize our indexes, so we have a 300GB *prx file. If you really have a filesystem limit of 2GB, there is a parameter called maxMergeMB in Solr 3.1 that you can set. Unfortunately, it is the maximum size of a segment that will be merged, rather than the maximum size of the resulting segment. So if you have a mergeFactor of 10, you could probably set it somewhere around (2GB / 10) = 200. Just to be cautious, you might want to set it to 100.

<mergePolicy class="org.apache.lucene.index.LogByteSizeMergePolicy">
  <double name="maxMergeMB">200</double>
</mergePolicy>

In the flexible indexing branch/trunk there is a new merge policy and parameter that allows you to set the maximum size of the merged segment: https://issues.apache.org/jira/browse/LUCENE-854. Tom Burton-West http://www.hathitrust.org/blogs/large-scale-search -Original Message- From: Juan Grande [mailto:juan.gra...@gmail.com] Sent: Friday, April 15, 2011 5:15 PM To: solr-user@lucene.apache.org Subject: Re: QUESTION: SOLR INDEX BIG FILE SIZES Hi John, ¿How can I split the file of the solr index into multiple files? Actually, the index is organized in a set of files called segments. It's not just a single file, unless you tell Solr to do so. That's because some file systems only support a maximum amount of space in a single file; for example, some UNIX file systems only support a maximum of 2GB per file.
As far as I know, Solr will never arrive to a segment file greater than 2GB, so this shouldn't be a problem. ¿What is the recommended storage strategy for big solr index files? I guess that it depends on the indexing/querying performance that you're having, the performance that you want, and what big exactly means for you. If your index is so big that individual queries take too long, sharding may be what you're looking for. To better understand the index format, you can see http://lucene.apache.org/java/3_1_0/fileformats.html Also, you can take a look at my blog (http://juanggrande.wordpress.com); in my last post I talk about segment merging. Regards, *Juan* 2011/4/15 JOHN JAIRO GÓMEZ LAVERDE jjai...@hotmail.com SOLR USER SUPPORT TEAM I have a question about the maximum file size of the solr index, when I have a lot of data in the solr index: -¿How can I split the file of the solr index into multiple files? That's because some file systems only support a maximum amount of space in a single file; for example, some UNIX file systems only support a maximum of 2GB per file. -¿What is the recommended storage strategy for big solr index files? Thanks for the reply. JOHN JAIRO GÓMEZ LAVERDE Bogotá - Colombia - South America
Re: QUESTION: SOLR INDEX BIG FILE SIZES
Hi John, ¿How can I split the file of the solr index into multiple files? Actually, the index is organized in a set of files called segments. It's not just a single file, unless you tell Solr to do so. That's because some file systems only support a maximum amount of space in a single file; for example, some UNIX file systems only support a maximum of 2GB per file. As far as I know, Solr will never arrive to a segment file greater than 2GB, so this shouldn't be a problem. ¿What is the recommended storage strategy for big solr index files? I guess that it depends on the indexing/querying performance that you're having, the performance that you want, and what big exactly means for you. If your index is so big that individual queries take too long, sharding may be what you're looking for. To better understand the index format, you can see http://lucene.apache.org/java/3_1_0/fileformats.html Also, you can take a look at my blog (http://juanggrande.wordpress.com); in my last post I talk about segment merging. Regards, *Juan* 2011/4/15 JOHN JAIRO GÓMEZ LAVERDE jjai...@hotmail.com SOLR USER SUPPORT TEAM I have a question about the maximum file size of the solr index, when I have a lot of data in the solr index: -¿How can I split the file of the solr index into multiple files? That's because some file systems only support a maximum amount of space in a single file; for example, some UNIX file systems only support a maximum of 2GB per file. -¿What is the recommended storage strategy for big solr index files? Thanks for the reply. JOHN JAIRO GÓMEZ LAVERDE Bogotá - Colombia - South America
Re: Error on string searching # [STRANGE]
I think that the problem is with the # symbol, because it has a special meaning when used inside a URL. Try replacing it with %23, like this: http://192.168.3.3:8983/solr3.1/core0/select?q=myfield:(S.%23L.W.VI.37) Regards, * Juan G. Grande* -- Solr Consultant @ http://www.plugtree.com -- Blog @ http://juanggrande.wordpress.com On Thu, Mar 10, 2011 at 12:45 PM, Dario Rigolin dario.rigo...@comperio.it wrote: I have a text field indexed using WordDelimiter, indexed this way:

<doc>
  <field name="myfield">S.#L.W.VI.37</field>
  ...
</doc>

Searching this way: http://192.168.3.3:8983/solr3.1/core0/select?q=myfield:(S.#L.W.VI.37) produces this error: org.apache.lucene.queryParser.ParseException: Cannot parse 'myfield:(S.': Lexical error at line 1, column 17. Encountered: EOF after : \S. It seems that # is a wrong character for the query... I tried urlencoding, adding a slash before it, and removing quotes, but other errors come up: http://192.168.3.3:8983/solr3.1/core0/select?q=myfield:(S.#L.W.VI.37) org.apache.lucene.queryParser.ParseException: Cannot parse 'myfield:(S.': Encountered EOF at line 1, column 15. Was expecting one of: AND ... OR ... NOT ... + ... - ... ( ... ) ... * ... ^ ... QUOTED ... TERM ... FUZZY_SLOP ... PREFIXTERM ... WILDTERM ... [ ... { ... NUMBER ... Any idea how to solve this? Maybe a bug? Or probably I'm missing something. Dario.
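Juan's escaping advice can be checked with a few lines of Python's standard library (illustrative; not part of the original thread):

```python
from urllib.parse import quote

# '#' begins the fragment part of a URL, so a raw '#' in a query string
# is cut off before it reaches Solr; percent-encode it as %23 first.
value = "S.#L.W.VI.37"
encoded = quote(value)  # '.' is safe by default, '#' becomes %23
print(encoded)          # S.%23L.W.VI.37

# hypothetical request URL built from the encoded value
url = "http://192.168.3.3:8983/solr3.1/core0/select?q=myfield:(%s)" % encoded
```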
Re: List of indexed or stored fields
You can query all the indexed or stored fields (including dynamic fields) using the LukeRequestHandler: http://localhost:8983/solr/example/admin/luke See also: http://wiki.apache.org/solr/LukeRequestHandler Regards, * **Juan G. Grande* -- Solr Consultant @ http://www.plugtree.com -- Blog @ http://juanggrande.wordpress.com On Tue, Jan 25, 2011 at 12:39 PM, kenf_nc ken.fos...@realestate.com wrote: I use a lot of dynamic fields, so looking at my schema isn't a good way to see all the field names that may be indexed across all documents. Is there a way to query solr for that information? All field names that are indexed, or stored? Possibly a count by field name? Is there any other metadata about a field that can be queried? -- View this message in context: http://lucene.472066.n3.nabble.com/List-of-indexed-or-stored-fields-tp2330986p2330986.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: [POLL] Where do you get Lucene/Solr from? Maven? ASF Mirrors?
Where do you get your Lucene/Solr downloads from? [] ASF Mirrors (linked in our release announcements or via the Lucene website) [X] Maven repository (whether you use Maven, Ant+Ivy, Buildr, etc.) [X] I/we build them from source via an SVN/Git checkout. [] Other (someone in your company mirrors them internally or via a downstream project) Juan Grande
Re: lazy loading error?
In order to use the ExtractingRequestHandler, you have to first copy apache-solr-cell-version.jar and all the libraries from contrib/extraction/lib to a lib folder next to the conf folder of your instance. Also, check the URL because there is an ampersand missing. Regards, *Juan Grande* On Wed, Jan 19, 2011 at 7:43 AM, Jörg Agatz joerg.ag...@googlemail.comwrote: Hallo, i have a problem with Solr and it looks like RequestHandlers.. but i dont know what i must do... i have remove and reinstall Openjdk installt maven2 and tika, nothing Chane.. someware in idea for me? Command: curl http://192.168.105.210:8080/solr/rechnungen/update/extract?literal.id=1234567uprefix=attr_commit=true -F myfile=@test.xls EROOR: htmlheadtitleApache Tomcat/6.0.24 - Error report/titlestyle!--H1 {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;font-size:22px;} H2 {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;font-size:16px;} H3 {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;font-size:14px;} BODY {font-family:Tahoma,Arial,sans-serif;color:black;background-color:white;} B {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;} P {font-family:Tahoma,Arial,sans-serif;background:white;color:black;font-size:12px;}A {color : black;}A.name {color : black;}HR {color : #525D76;}--/style /headbodyh1HTTP Status 500 - lazy loading error org.apache.solr.common.SolrException: lazy loading error at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.getWrappedHandler(RequestHandlers.java:249) at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:231) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241) at 
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206) at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233) at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191) at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127) at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102) at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109) at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298) at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:859) at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:588) at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489) at java.lang.Thread.run(Thread.java:636) Caused by: org.apache.solr.common.SolrException: Error loading class 'org.apache.solr.handler.extraction.ExtractingRequestHandler' at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:375) at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:413) at org.apache.solr.core.SolrCore.createRequestHandler(SolrCore.java:449) at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.getWrappedHandler(RequestHandlers.java:240) ... 
16 more Caused by: java.lang.ClassNotFoundException: org.apache.solr.handler.extraction.ExtractingRequestHandler at java.net.URLClassLoader$1.run(URLClassLoader.java:217) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:205) at java.lang.ClassLoader.loadClass(ClassLoader.java:321) at java.net.FactoryURLClassLoader.loadClass(URLClassLoader.java:615) at java.lang.ClassLoader.loadClass(ClassLoader.java:266) at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:264) at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:359) ... 19 more /h1HR size=1 noshade=noshadepbtype/b Status report/ppbmessage/b ulazy loading error org.apache.solr.common.SolrException: lazy loading error at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.getWrappedHandler(RequestHandlers.java:249) at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:231) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java
Re: spell suggest response
It isn't exactly what you want, but did you try the onlyMorePopular parameter? http://wiki.apache.org/solr/SpellCheckComponent#spellcheck.onlyMorePopular Regards, Juan Grande On Wed, Jan 12, 2011 at 7:29 AM, satya swaroop satya.yada...@gmail.com wrote: Hi Stefan, I need the words from the index itself. If "java" is given, then the relevant, similar, or nearby words in the index should be shown, even when the given keyword is spelled correctly. Can that be done? Example: http://localhost:8080/solr/spellcheckCompRH?q=java&rows=0&spellcheck=true&spellcheck.count=10 In the output no suggestions come back, since "java" is a correctly spelled word. But can't we get nearby suggestions such as javax, javac, etc. (the terms in the index)? I read about the Suggester on the Solr wiki at http://wiki.apache.org/solr/Suggester and tried to implement it, but got errors: *error loading class org.apache.solr.spelling.suggest.suggester* Regards, satya
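The Suggester described on the wiki is essentially a prefix lookup over the indexed terms, which is the behavior satya is after: return nearby terms even when the input is itself a valid word. A toy Python sketch of that idea (invented term list, not Solr's actual implementation):

```python
def suggest(prefix, index_terms, count=10):
    """Return up to `count` indexed terms starting with `prefix`, in
    lexicographic order. Unlike spellcheck, the prefix itself being a
    valid indexed term does not suppress the other suggestions."""
    return sorted(t for t in index_terms if t.startswith(prefix))[:count]

# Hypothetical indexed vocabulary
terms = {"java", "javax", "javac", "jakarta", "lucene"}
print(suggest("java", terms))  # ['java', 'javac', 'javax']
```

Note the class name in satya's error is lowercase; the wiki configures `org.apache.solr.spelling.suggest.Suggester` with a capital S, which is worth double-checking in solrconfig.xml.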
Re: Term frequency across multiple documents
Maybe there is a better solution, but I think that you can solve this problem using facets. You will get the number of documents where each term appears. Also, you can filter a specific set of terms by entering a query like +field:term1 OR +field:term2 OR ..., or using the facet.query parameter. Regards, Juan Grande On Wed, Jan 12, 2011 at 11:08 AM, Aaron Bycoffe abyco...@sunlightfoundation.com wrote: I'm attempting to calculate term frequency across multiple documents in Solr. I've been able to use TermVectorComponent to get this data on a per-document basis but have been unable to find a way to do it for multiple documents -- that is, get a list of terms appearing in the documents and how many times each one appears. I'd also like to be able to filter the list of terms to be able to see how many times a specific term appears, though this is less important. Is there a way to do this in Solr? Aaron
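Facet counts are exactly this "number of documents containing each term" figure: each document contributes at most one count per distinct term, regardless of how often the term repeats inside it. A small Python sketch of that semantics (hypothetical documents, not Solr code):

```python
from collections import Counter

def facet_counts(docs, field):
    """For each distinct term in `field`, count the number of documents
    containing it. A term repeated within one document counts once,
    which is what a facet count on an indexed field reports."""
    counts = Counter()
    for doc in docs:
        for term in set(doc.get(field, [])):  # set(): dedupe within a doc
            counts[term] += 1
    return counts

docs = [
    {"body": ["solr", "lucene", "solr"]},  # "solr" twice, counted once
    {"body": ["lucene"]},
    {"body": ["solr", "facet"]},
]
c = facet_counts(docs, "body")  # solr -> 2, lucene -> 2, facet -> 1
```

Restricting the report to specific terms then maps onto facet.query (one query per term of interest) rather than faceting over the whole field.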
Re: Sorting within grouped results?
Did you try adding the parameter group.sort=popularity+desc to the URL? I think that's what you want, according to http://wiki.apache.org/solr/FieldCollapsing. Good luck Juan Grande On Thu, Jan 6, 2011 at 3:30 AM, Andy angelf...@yahoo.com wrote: I want to group my results by a field named group_id. According to http://wiki.apache.org/solr/FieldCollapsing , for each unique value of group_id a docList with the top scoring document is returned. But in my case I want to sort the results within each group_id by an int field popularity instead. So within each group_id I just want the document with the highest popularity. Is it possible to do that? Thanks.
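group.sort orders the documents inside each group independently of the overall result sort, so with group.sort=popularity+desc the head document of each group is the most popular one. A toy Python sketch of what group=true&group.field=group_id&group.sort=popularity desc returns with one document per group (invented documents, plain Python rather than Solr):

```python
from itertools import groupby

def top_per_group(docs, group_field, sort_field):
    """For each distinct group_field value, keep the doc with the highest
    sort_field, mimicking group=true with group.sort=<sort_field> desc
    and the default one document per group."""
    docs = sorted(docs, key=lambda d: d[group_field])  # groupby needs sorted input
    return {
        gid: max(grp, key=lambda d: d[sort_field])
        for gid, grp in groupby(docs, key=lambda d: d[group_field])
    }

docs = [
    {"group_id": 1, "popularity": 5, "id": "a"},
    {"group_id": 1, "popularity": 9, "id": "b"},
    {"group_id": 2, "popularity": 3, "id": "c"},
]
result = top_per_group(docs, "group_id", "popularity")
print(result[1]["id"])  # b
```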
Re: Searching similar values for same field results in different results
You have a problem with the analysis chain. When you run a query, the EnglishPorterFilter is cutting off the last part of your word, but the same isn't happening at index time. I think that removing that filter from the query chain will solve your problem. Remember that there are two different analysis chains, one applied at index time and one at query time. I think you didn't see the shortened word in analysis.jsp because you entered the text in the Field Value (Index) box, so it ran the index-time chain. To see the result of the query-time chain, enter the text in the Field Value (Query) box. Good luck, Juan Grande On Thu, Jan 6, 2011 at 10:58 AM, PeterKerk vettepa...@hotmail.com wrote: @iorixxx: I ran: http://localhost:8983/solr/db/update/?optimize=true This is the response: <response><lst name="responseHeader"><int name="status">0</int><int name="QTime">58</int></lst></response> Then I ran: http://localhost:8983/solr/db/select/?indent=on&facet=on&q=*:*&facet.field=themes_raw This is the response: <lst name="facet_fields"><lst name="themes_raw"><int name="Hotel en Restaurant">366</int><int name="Kasteel en Landgoed">153</int><int name="Strand en Zee">16</int></lst></lst> So it seems that nothing has changed there, and it looks like the results were shown correctly before the optimize operation too. When you say HTTP caching, do you mean caching by the browser? Or does Solr have some caching by default? If the latter, how can I clear that cache? @Erick: I added debugQuery. For "Strand en Zee" I see this: <arr name="parsed_filter_queries"><str>PhraseQuery(themes:strand en zee)</str></arr> Looks correct. For "Kasteel en Landgoed" I see this: <arr name="parsed_filter_queries"><str>PhraseQuery(themes:kasteel en landgo)</str></arr> Which isn't correct! So it seems herein lies the problem.
Now I'm wondering why the value is cut off... this is my schema.xml:

<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_dutch.txt"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords_dutch.txt"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>

<field name="themes" type="text" indexed="true" stored="true" multiValued="true"/>
<field name="themes_raw" type="string" indexed="true" stored="true" multiValued="true"/>

I checked analysis.jsp (Field: themes, Field value: "Kasteel en Landgoed") and schema.jsp, but I didn't see any weird results. Now I'm wondering what else it could be.. -- View this message in context: http://lucene.472066.n3.nabble.com/Searching-similar-values-for-same-field-results-in-different-results-tp2199269p2205706.html Sent from the Solr - User mailing list archive at Nabble.com.
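The mismatch Juan describes can be shown with a toy stemmer. The function below is a crude stand-in for EnglishPorterFilterFactory (the real Porter algorithm is far more involved): because the query-side chain stems "Landgoed" while the index-side chain stored it unstemmed, the phrase query can no longer match the indexed tokens.

```python
def toy_stem(token):
    """Crude stand-in for a Porter-style stemmer: strip a trailing 'ed'.
    Not the real algorithm, just enough to show the mismatch."""
    return token[:-2] if token.endswith("ed") and len(token) > 4 else token

def index_analyze(text):
    # Index-time chain from the schema: no stemmer, just lowercasing
    return [t.lower() for t in text.split()]

def query_analyze(text):
    # Query-time chain: lowercasing plus the (toy) Porter stemmer
    return [toy_stem(t.lower()) for t in text.split()]

indexed = index_analyze("Kasteel en Landgoed")  # ['kasteel', 'en', 'landgoed']
queried = query_analyze("Kasteel en Landgoed")  # ['kasteel', 'en', 'landgo']
print(indexed == queried)  # False -> the phrase query cannot match
```

Making both chains identical for this field (dropping the stemmer from the query chain, or adding it to both) removes the asymmetry.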