Re: Quering the database
Thanks a lot to all, now it's clear the problem was in the schema. One more thing I would like to know: if the user queries for something, does it always have to be like q=field:monitor, where field is defined in the schema and monitor is just text in a column?

Hando
--
View this message in context: http://lucene.472066.n3.nabble.com/Quering-the-database-tp1015636p1018268.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Quering the database
No. Solr is really flexible and allows for a lot of complex querying out of the box. Really, the wiki is your best friend here: http://wiki.apache.org/solr/

Perhaps start with:
1. http://lucene.apache.org/solr/tutorial.html
2. http://wiki.apache.org/solr/SolrQuerySyntax
3. http://wiki.apache.org/solr/QueryParametersIndex (a list of the standard parameters with links to their function/use); especially look at the 'fq' param, which is another way to limit your result set.

Then just browse the wiki starting from the homepage for the rest. It should pretty quickly give you an overview of what's possible.

cheers,
Geert-Jan

2010/8/3 Hando420 hando...@gmail.com:
[quoted message trimmed; see Hando's question above]
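To make the q-vs-fq distinction concrete, here is a small sketch. The field name "category" and the collection URL are hypothetical, purely for illustration: q carries the main (scored) query, while fq adds a cached filter that narrows the result set without affecting relevance.

```python
from urllib.parse import urlencode

# Hypothetical example: full-text query in q, plus an fq filter
# that restricts the result set without influencing scoring.
base = "http://localhost:8983/solr/select"   # assumed default endpoint
params = {
    "q": "monitor",              # main query (scored)
    "fq": "category:hardware",   # filter query: cached, no score impact
    "rows": 10,
}
url = base + "?" + urlencode(params)
print(url)
# http://localhost:8983/solr/select?q=monitor&fq=category%3Ahardware&rows=10
```

Multiple fq parameters can be given in one request; each is cached independently in the filterCache.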
Solr query result cache size and expire property
Hi all! Please help: how can I calculate the queryResultCache size (how much RAM should be dedicated to it)? I have an index of size 1.5 and 4 million docs; queryResultWindowSize is 20. Could I use an expire property on the documents in this cache?

regards,
Stanislaw
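For what it's worth, a rough back-of-envelope sketch: each queryResultCache entry holds roughly queryResultWindowSize document ids plus the overhead of the query object used as the cache key. All numbers below are assumptions for illustration, not measurements; the real footprint depends on the JVM and Solr version.

```python
# Rough, illustrative estimate of queryResultCache memory use.
# Every number here is an assumption, not a measurement.
cache_entries = 512      # the "size" attribute on queryResultCache
window_size = 20         # queryResultWindowSize (from the question)
bytes_per_docid = 4      # int doc ids
key_overhead = 200       # assumed per-entry overhead for the query key

per_entry = window_size * bytes_per_docid + key_overhead
total_bytes = cache_entries * per_entry
print(f"~{total_bytes / 1024:.0f} KB for {cache_entries} entries")
```

Even with generous assumptions this cache tends to be small compared to the index. On the expire question: Solr's caches are per-searcher, so entries are discarded wholesale when a new searcher opens after a commit; before that, eviction follows the cache's own policy (e.g. LRU), and as far as I know there is no per-document expire property.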
Highlighting, return the matched terms only
Hi, how could I have the highlighting component return only the terms that were matched, without any surrounding text ?
Re: Indexing data on MSSQL failed: Caused by: org.apache.solr.common.SolrException: Error loading class 'com.microsoft.sqlserver.jdbc.SQLServerDriver'
Hooray, I'm a bit further now... It turned out that both SQL Server 2005 and 2008 were running, and 2005 was listening on port 1433. I disabled that service, enabled the 2008 service, and now I can connect; the command http://localhost:8983/solr/db/dataimport?command=full-import is successful.

But now I want to SEARCH in those indexed documents via http://localhost:8983/solr/db/admin/. Here is the result of the query *:*:

<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">0</int>
    <lst name="params">
      <str name="indent">on</str>
      <str name="start">0</str>
      <str name="q">*:*</str>
      <str name="version">2.2</str>
      <str name="rows">10</str>
    </lst>
  </lst>
  <result name="response" numFound="3" start="0">
    <doc>
      <str name="id">1</str>
      <date name="timestamp">2010-08-03T11:43:38.905Z</date>
      <str name="title">Gemeentehuis Nijmegen</str>
    </doc>
    <doc>
      <str name="id">2</str>
      <date name="timestamp">2010-08-03T11:43:38.936Z</date>
      <str name="title">Gemeentehuis Utrecht</str>
    </doc>
    <doc>
      <str name="id">3</str>
      <date name="timestamp">2010-08-03T11:43:38.936Z</date>
      <str name="title">Beachclub Vroeger</str>
    </doc>
  </result>
</response>

This is my data-config:

<dataConfig>
  <dataSource driver="com.microsoft.sqlserver.jdbc.SQLServerDriver"
              url="jdbc:sqlserver://localhost:1433;databaseName=wedding"
              user="sa" password="123456" />
  <document name="weddinglocations">
    <entity name="location" query="select * from locations">
      <field column="ID" name="id" />
      <field column="TITLE" name="title" />
      <field column="CITY" name="city" />
      <entity name="location_feature" query="select FEATUREID from location_features where locationid='${location.ID}'">
        <entity name="feature" query="select title from features where id = '${location_feature.FEATUREID}'">
          <field name="features" column="title" />
        </entity>
      </entity>
      <entity name="location_theme" query="select THEMEID from location_themes where locationid='${location.ID}'">
        <entity name="theme" query="select title from features where id = '${location_theme.FEATUREID}'">
          <field name="cat" column="title" />
        </entity>
      </entity>
    </entity>
  </document>
</dataConfig>

Why don't I see any data with regard to themes or features?
Regards, Pete
Re: want to display elevated results on my display result screen differently.
Suppose I have an elevate.xml file and I elevate the IDs Artist:11650 and Artist:510 when I search for "corgan". This is the elevate file:

<elevate>
  <query text="corgan">
    <doc id="Artist:11650" /><!-- the Smashing Pumpkins -->
    <doc id="Artist:510" /><!-- Green Day -->
    <doc id="Artist:35656" exclude="true" /><!-- Starchildren -->
  </query>
  <!-- other queries... -->
</elevate>

Is there any way (e.g. a query parameter) which gives us a clue as to which ids were elevated when the actual search for "corgan" is done? When we search, the result xml structure is the same as for a normal search without elevation. I want to display elevated results on my result screen differently.
Re: Quering the database
Thanks, these links were useful. Managed to figure out what I needed.

Cheers,
Hando
RE: Indexing data on MSSQL failed: Caused by: org.apache.solr.common.SolrException: Error loading class 'com.microsoft.sqlserver.jdbc.SQLServerDriver'
We'd need to see your schema.xml file for that; it's probably something in your field types.

-----Original Message-----
From: PeterKerk [mailto:vettepa...@hotmail.com]
Sent: Tuesday, August 03, 2010 7:49 AM
To: solr-user@lucene.apache.org
Subject: Re: Indexing data on MSSQL failed: Caused by: org.apache.solr.common.SolrException: Error loading class 'com.microsoft.sqlserver.jdbc.SQLServerDriver'

[quoted message trimmed; see PeterKerk's message above]
Re: Indexing data on MSSQL failed: Caused by: org.apache.solr.common.SolrException: Error loading class 'com.microsoft.sqlserver.jdbc.SQLServerDriver'
Thanks for the quick reply :) Here it is:

<?xml version="1.0" encoding="UTF-8" ?>
<!--
 Licensed to the Apache Software Foundation (ASF) under one or more
 contributor license agreements. See the NOTICE file distributed with
 this work for additional information regarding copyright ownership.
 The ASF licenses this file to You under the Apache License, Version 2.0
 (the "License"); you may not use this file except in compliance with
 the License. You may obtain a copy of the License at

     http://www.apache.org/licenses/LICENSE-2.0

 Unless required by applicable law or agreed to in writing, software
 distributed under the License is distributed on an "AS IS" BASIS,
 WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 See the License for the specific language governing permissions and
 limitations under the License.
-->

<!--
 This is the Solr schema file. This file should be named "schema.xml" and
 should be in the conf directory under the solr home
 (i.e. ./solr/conf/schema.xml by default) or located where the classloader
 for the Solr webapp can find it.

 This example schema is the recommended starting point for users.
 It should be kept correct and concise, usable out-of-the-box.

 For more information, on how to customize this file, please see
 http://wiki.apache.org/solr/SchemaXml
-->

<schema name="db" version="1.1">
  <!-- attribute "name" is the name of this schema and is only used for
       display purposes. Applications should change this to reflect the
       nature of the search collection.
       version="1.1" is Solr's version number for the schema syntax and
       semantics. It should not normally be changed by applications.
       1.0: multiValued attribute did not exist, all fields are multiValued
            by nature
       1.1: multiValued attribute introduced, false by default -->

  <types>
    <!-- field type definitions. The "name" attribute is just a label to be
         used by field definitions. The "class" attribute and any other
         attributes determine the real behavior of the fieldType.
         Class names starting with "solr" refer to java classes in the
         org.apache.solr.analysis package. -->

    <!-- The StrField type is not analyzed, but indexed/stored verbatim.
       - StrField and TextField support an optional compressThreshold which
         limits compression (if enabled in the derived fields) to values
         which exceed a certain size (in characters). -->
    <fieldType name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true"/>

    <!-- boolean type: "true" or "false" -->
    <fieldType name="boolean" class="solr.BoolField" sortMissingLast="true" omitNorms="true"/>

    <!-- The optional sortMissingLast and sortMissingFirst attributes are
         currently supported on types that are sorted internally as strings.
       - If sortMissingLast="true", then a sort on this field will cause
         documents without the field to come after documents with the field,
         regardless of the requested sort order (asc or desc).
       - If sortMissingFirst="true", then a sort on this field will cause
         documents without the field to come before documents with the field,
         regardless of the requested sort order.
       - If sortMissingLast="false" and sortMissingFirst="false" (the
         default), then default lucene sorting will be used which places docs
         without the field first in an ascending sort and last in a
         descending sort. -->

    <!-- numeric field types that store and index the text value verbatim
         (and hence don't support range queries, since the lexicographic
         ordering isn't equal to the numeric ordering) -->
    <fieldType name="integer" class="solr.IntField" omitNorms="true"/>
    <fieldType name="long" class="solr.LongField" omitNorms="true"/>
    <fieldType name="float" class="solr.FloatField" omitNorms="true"/>
    <fieldType name="double" class="solr.DoubleField" omitNorms="true"/>

    <!-- Numeric field types that manipulate the value into a string value
         that isn't human-readable in its internal form, but with a
         lexicographic ordering the same as the numeric ordering, so that
         range queries work correctly. -->
    <fieldType name="sint" class="solr.SortableIntField" sortMissingLast="true" omitNorms="true"/>
    <fieldType name="slong" class="solr.SortableLongField" sortMissingLast="true" omitNorms="true"/>
    <fieldType name="sfloat" class="solr.SortableFloatField" sortMissingLast="true" omitNorms="true"/>
    <fieldType name="sdouble" class="solr.SortableDoubleField" sortMissingLast="true" omitNorms="true"/>

    <!-- The format for this date field is of the form 1995-12-31T23:59:59Z,
         and is a more restricted form of the canonical representation of
         dateTime http://www.w3.org/TR/xmlschema-2/#dateTime
         The trailing "Z" designates UTC time and is mandatory.
         Optional fractional seconds are allowed: 1995-12-31T23:59:59.999Z
         All other components are mandatory.
         Expressions can also be used to denote
query about qf defaults
Hi, I have in my solr config file the code below to create a default for fq, which works great. The problem I have is that if I want to use a custom fq, this one gets overwritten. Is there a way I can have it keep this fq alongside other custom ones? Basically this field sets whether the person is to show up or not, so it's important anyone set to d is never shown regardless of any other query filters.

<lst name="defaults">
  <str name="fq">ss_cck_field_status:d</str>
</lst>

thanks in advance for any help
Robert
Re: want to display elevated results on my display result screen differently.
Have you looked at the relevance scores? I would speculate that elevated matches have a constant, high score.

On 8/3/10, Vishal.Arora vis...@value-one.com wrote:
[quoted message trimmed; see the original question above]

--
Sent from my mobile device
Re: Queries with multiple wildcards failing in branch3x
Thanks, I updated to the latest version with the fix, but I'm now getting another error when optimizing the index (or when searching certain fields). It mentions "unknown compression method", but I'm not using compressed fields at all.

SEVERE: java.io.IOException: background merge hit exception: _a:C248670/19645 _l:C206701/14563 _m:C12186/100 _n:C11356 _o:C9945 _p:C9000 _q:C5704 _r:C2214 _s:C2000 _t:C1264 into _u [optimize] [mergeDocStores]
  at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2392)
  at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2320)
  at org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:403)
  at org.apache.solr.update.processor.RunUpdateProcessor.processCommit(RunUpdateProcessorFactory.java:85)
  at org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:169)
  at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:69)
  at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54)
  at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
  at org.apache.solr.core.SolrCore.execute(SolrCore.java:1322)
  at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:341)
  at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:244)
  at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
  at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
  at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
  at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
  at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
  at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
  at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
  at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298)
  at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:857)
  at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:588)
  at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
  at java.lang.Thread.run(Thread.java:619)
Caused by: org.apache.lucene.index.CorruptIndexException: field data are in wrong format: java.util.zip.DataFormatException: unknown compression method
  at org.apache.lucene.index.FieldsReader.uncompress(FieldsReader.java:585)
  at org.apache.lucene.index.FieldsReader.addField(FieldsReader.java:357)
  at org.apache.lucene.index.FieldsReader.doc(FieldsReader.java:239)
  at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:894)
  at org.apache.lucene.index.IndexReader.document(IndexReader.java:684)
  at org.apache.lucene.index.SegmentMerger.copyFieldsWithDeletions(SegmentMerger.java:410)
  at org.apache.lucene.index.SegmentMerger.mergeFields(SegmentMerger.java:338)
  at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:159)
  at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4053)
  at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3647)
  at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:339)
  at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:407)
Caused by: java.util.zip.DataFormatException: unknown compression method
  at java.util.zip.Inflater.inflateBytes(Native Method)
  at java.util.zip.Inflater.inflate(Inflater.java:238)
  at java.util.zip.Inflater.inflate(Inflater.java:256)
  at org.apache.lucene.document.CompressionTools.decompress(CompressionTools.java:106)
  at org.apache.lucene.index.FieldsReader.uncompress(FieldsReader.java:582)
  ... 11 more

On Mon, Aug 2, 2010 at 6:04 PM, Michael McCandless luc...@mikemccandless.com wrote:
> This looks like the index corruption caused by a commit on Friday. See the thread I sent earlier with subject "heads up -- index corruption on Solr/Lucene trunk/3.x branch".
> Mike

On Mon, Aug 2, 2010 at 6:00 PM, Paul Dlug paul.d...@gmail.com wrote:
> I'm running a recent build of branch3x (r981609); queries with multiple wildcards (e.g. a*b*c*) are failing with the exception below in the log. These queries worked fine for me with Solr 1.4. Known bug?
>
> SEVERE: java.lang.IndexOutOfBoundsException: Index: 114, Size: 39
>   at java.util.ArrayList.RangeCheck(ArrayList.java:547)
>   at java.util.ArrayList.get(ArrayList.java:322)
>   at org.apache.lucene.index.FieldInfos.fieldInfo(FieldInfos.java:285)
>   at
Parsing xml of the search results
The default style of search results is raw XML. If I want to make it more user friendly, do I have to add a stylesheet to the XML search response? Thanks.

Hando
Re: Parsing xml of the search results
You can use XSL: http://wiki.apache.org/solr/XsltResponseWriter

Robert

----- Original Message -----
From: Hando420
Sent: 03/08/10 02:59 PM
To: solr-user@lucene.apache.org
Subject: Parsing xml of the search results

[quoted message trimmed; see Hando's question above]
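Selecting the XSLT response writer is just a matter of two request parameters: wt=xslt and tr (the stylesheet file name, looked up in conf/xslt/). A small sketch of building such a URL, where "example.xsl" is a placeholder for whatever stylesheet you install:

```python
from urllib.parse import urlencode

# Sketch of a request using the XSLT response writer.
# "example.xsl" is an assumed stylesheet name in conf/xslt/.
params = {"q": "*:*", "wt": "xslt", "tr": "example.xsl"}
url = "http://localhost:8983/solr/select?" + urlencode(params)
print(url)
```

Solr then applies the stylesheet server-side and returns the transformed (e.g. HTML) output instead of raw XML.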
Re: Queries with multiple wildcards failing in branch3x
Ugh... I think there may still be a bug lurking. Karl is also still having problems, much further into his indexing process. I'm hunting it now!! For the time being, I just disabled (committed to trunk & 3x) the optimization that's causing the bug. Can you update to 3x head (or trunk head), remove your current index, and try again?

Mike

On Tue, Aug 3, 2010 at 8:52 AM, Paul Dlug paul.d...@gmail.com wrote:
[quoted message trimmed; see Paul's full report with stack trace above]
Re: query about qf defaults
You can use "appends" for any additional fq parameters, which will be appended to the ones passed at query time. Check out the sample solrconfig.xml that ships with Solr:

<!-- In addition to defaults, "appends" params can be specified to identify
     values which should be appended to the list of multi-val params from
     the query (or the existing "defaults").

     In this example, the param "fq=instock:true" will be appended to any
     query time fq params the user may specify, as a mechanism for
     partitioning the index, independent of any user selected filtering
     that may also be desired (perhaps as a result of faceted searching).

     NOTE: there is *absolutely* nothing a client can do to prevent these
     "appends" values from being used, so don't use this mechanism unless
     you are sure you always want it.
  -->
<lst name="appends">
  <str name="fq">inStock:true</str>
</lst>

Regards,
Jayendra

On Tue, Aug 3, 2010 at 8:25 AM, Robert Neve robert.n...@gmx.co.uk wrote:
[quoted message trimmed; see the original question above]
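To make the defaults-vs-appends behavior concrete, here is a toy model (a conceptual sketch only, not Solr's actual implementation): a "defaults" fq is used only when the request carries no fq of its own, while an "appends" fq is always added on top of whatever the request supplies.

```python
# Toy model of how Solr combines "defaults" and "appends" fq params.
# Conceptual sketch only; the real merging lives inside Solr.
def effective_fq(request_fq, defaults_fq=None, appends_fq=None):
    # defaults apply only when the request has no fq at all
    fqs = list(request_fq) if request_fq else ([defaults_fq] if defaults_fq else [])
    # appends are always tacked on, regardless of the request
    if appends_fq:
        fqs.append(appends_fq)
    return fqs

# User passes their own fq: the default is replaced, the append survives.
print(effective_fq(["city:Utrecht"], defaults_fq="status:live", appends_fq="inStock:true"))
# ['city:Utrecht', 'inStock:true']
```

This is exactly why moving the mandatory filter from "defaults" to "appends" solves the original problem: a client-supplied fq can no longer knock it out.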
Re: Parsing xml of the search results
Thanks man, this helped.

Hando
Re: Queries with multiple wildcards failing in branch3x
Sure, I'm reindexing now, I'll let you know how it goes.

--Paul

On Tue, Aug 3, 2010 at 9:05 AM, Michael McCandless luc...@mikemccandless.com wrote:
[quoted message trimmed; see Mike's reply above]
Re: Parsing xml of the search results
What do you mean by user friendly? If you want an actual end-user search interface, it now comes out of the box on both the trunk and the 3_x branch. Fire up the example, index the example data, and go to /browse. That UI is generated using the Velocity response writer. You can get Solr's response in a number of formats, such as Ruby, JSON, Python, PHP, XSLT'd, via Velocity templates, and more.

Erik

On Aug 3, 2010, at 8:59 AM, Hando420 wrote:
[quoted message trimmed; see Hando's question above]
Re: Parsing xml of the search results
By user friendly I meant that the XML is raw and not easy to read. So I wondered what the best approach is for showing the results in a more readable way. I got a reply to this post, but your comments were also valuable. Thanks. Cheers Hando -- View this message in context: http://lucene.472066.n3.nabble.com/Parsing-xml-of-the-search-results-tp1019017p1019120.html Sent from the Solr - User mailing list archive at Nabble.com.
Importing CSV with post.jar
Instructions on importing a CSV file with post.jar to a custom URL. I had some trouble finding this information on the forums, so I decided to post it.

STEP #1: Import the file

java -Durl=http://localhost:8983/solr/clients_ib/update/csv -Dcommit=no -jar post.jar data.csv

STEP #2: Commit

java -Dcommit=yes -Durl=http://localhost:8983/solr/clients_ib/update -jar post.jar

Kind regards, Vladimir Sutskever Investment Bank - Technology JPMorgan Chase, Inc.
RE: Indexing data on MSSQL failed: Caused by: org.apache.solr.common.SolrException: Error loading class 'com.micros oft.sqlserver.jdbc.SQLServerDriver'
I can't see an obvious error, I'm afraid. Check the index with Luke: http://code.google.com/p/luke/ ... take particular note of the terms that are actually indexed, and what values each document has. You can also perform searches/etc; it's a useful tool. If the data isn't there, there's some problem... probably in dataconfig/schema relationship. Hard to say without knowing more. -Original Message- From: PeterKerk [mailto:vettepa...@hotmail.com] Sent: Tuesday, August 03, 2010 8:12 AM To: solr-user@lucene.apache.org Subject: Re: Indexing data on MSSQL failed: Caused by: org.apache.solr.common.SolrException: Error loading class 'com.micros oft.sqlserver.jdbc.SQLServerDriver' Thanks for the quick reply :) Here it is: ?xml version=1.0 encoding=UTF-8 ? !-- Licensed to the Apache Software Foundation (ASF) under one or more contributor license agreements. See the NOTICE file distributed with this work for additional information regarding copyright ownership. The ASF licenses this file to You under the Apache License, Version 2.0 (the License); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. -- !-- This is the Solr schema file. This file should be named schema.xml and should be in the conf directory under the solr home (i.e. ./solr/conf/schema.xml by default) or located where the classloader for the Solr webapp can find it. This example schema is the recommended starting point for users. It should be kept correct and concise, usable out-of-the-box. 
For more information, on how to customize this file, please see http://wiki.apache.org/solr/SchemaXml -- schema name=db version=1.1 !-- attribute name is the name of this schema and is only used for display purposes. Applications should change this to reflect the nature of the search collection. version=1.1 is Solr's version number for the schema syntax and semantics. It should not normally be changed by applications. 1.0: multiValued attribute did not exist, all fields are multiValued by nature 1.1: multiValued attribute introduced, false by default -- types !-- field type definitions. The name attribute is just a label to be used by field definitions. The class attribute and any other attributes determine the real behavior of the fieldType. Class names starting with solr refer to java classes in the org.apache.solr.analysis package. -- !-- The StrField type is not analyzed, but indexed/stored verbatim. - StrField and TextField support an optional compressThreshold which limits compression (if enabled in the derived fields) to values which exceed a certain size (in characters). -- fieldType name=string class=solr.StrField sortMissingLast=true omitNorms=true/ !-- boolean type: true or false -- fieldType name=boolean class=solr.BoolField sortMissingLast=true omitNorms=true/ !-- The optional sortMissingLast and sortMissingFirst attributes are currently supported on types that are sorted internally as strings. - If sortMissingLast=true, then a sort on this field will cause documents without the field to come after documents with the field, regardless of the requested sort order (asc or desc). - If sortMissingFirst=true, then a sort on this field will cause documents without the field to come before documents with the field, regardless of the requested sort order. - If sortMissingLast=false and sortMissingFirst=false (the default), then default lucene sorting will be used which places docs without the field first in an ascending sort and last in a descending sort. 
-- !-- numeric field types that store and index the text value verbatim (and hence don't support range queries, since the lexicographic ordering isn't equal to the numeric ordering) -- fieldType name=integer class=solr.IntField omitNorms=true/ fieldType name=long class=solr.LongField omitNorms=true/ fieldType name=float class=solr.FloatField omitNorms=true/ fieldType name=double class=solr.DoubleField omitNorms=true/ !-- Numeric field types that manipulate the value into a string value that isn't human-readable in its internal form, but with a lexicographic ordering the same as the numeric ordering, so that range queries work correctly. -- fieldType name=sint class=solr.SortableIntField sortMissingLast=true omitNorms=true/ fieldType name=slong
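The sortable numeric types described in the schema comments above work by encoding each number as a string whose lexicographic order matches the numeric order, so that string-based range queries and sorting behave correctly. A minimal illustration of the trick (this is not Solr's actual NumberUtils encoding, which uses a more compact char-based scheme; it just demonstrates the idea):

```java
public class SortableIntDemo {
    // Map an int to a string whose lexicographic order equals the numeric order.
    // Flipping the sign bit turns the signed range into an unsigned one, and
    // fixed-width hex keeps the string ordering stable across magnitudes.
    public static String sortableStr(int i) {
        long shifted = (i & 0xFFFFFFFFL) ^ 0x80000000L;
        return String.format("%08x", shifted);
    }

    public static void main(String[] args) {
        // Plain decimal strings would sort "10" < "9"; the encoding fixes that.
        System.out.println(sortableStr(-5).compareTo(sortableStr(3)) < 0);  // true
        System.out.println(sortableStr(3).compareTo(sortableStr(42)) < 0);  // true
    }
}
```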
How to extend the BinaryResponseWriter imposed by Solrj
Hi, I'm trying to extend the writer used by solrj (org.apache.solr.response.BinaryResponseWriter). I have declared it in solrconfig.xml like this: <queryResponseWriter name="myWriter" class="my.MyWriter"/> I see that it is initialized, but when I try to set the 'wt' param to 'myWriter' with solrQuery.setParam("wt", "myWriter"), nothing happens; it's still using the 'javabin' writer. Any idea? Thanks marc
Re: Indexing data on MSSQL failed: Caused by: org.apache.solr.common.SolrException: Error loading class 'com.micros oft.sqlserver.jdbc.SQLServerDriver'
Ok, downloaded Luke, but how do I run it? Tried googling it, but no luck... do I need to put it in some folder? And is there anything with regard to the casing of the columns? For example, I now have select FEATUREID and <field column="TITLE" name="title" /> in my data-config. However, my DB columns are all in lower casing... is Solr sensitive to that in any way? -- View this message in context: http://lucene.472066.n3.nabble.com/Indexing-data-on-MSSQL-failed-Caused-by-org-apache-solr-common-SolrException-Error-loading-class-com-tp1015137p1019387.html Sent from the Solr - User mailing list archive at Nabble.com.
RE: Indexing data on MSSQL failed: Caused by: org.apache.solr.common.SolrException: Error loading class 'com.micros oft.sqlserver.jdbc.SQLServerDriver'
Sorry; I should have linked you to the self-executable .jar with no dependencies (so double-clicking runs): http://code.google.com/p/luke/downloads/detail?name=lukeall-1.0.1.jar&can=2&q= Once you open it, you can open the index folder in your solr/data hierarchy. -Original Message- From: PeterKerk [mailto:vettepa...@hotmail.com] Sent: Tuesday, August 03, 2010 10:50 AM To: solr-user@lucene.apache.org Subject: Re: Indexing data on MSSQL failed: Caused by: org.apache.solr.common.SolrException: Error loading class 'com.micros oft.sqlserver.jdbc.SQLServerDriver' Ok, downloaded luke, but how do I run it? Tried googling it, but no luck...do I need to put in in some folder? And is there anything with regard to the casing of the columns? For example, I now have select FEATUREID and field column=TITLE name=title / in my data-config However, my DB columns are all in lower casing...is solr sensitive to that in any way? -- View this message in context: http://lucene.472066.n3.nabble.com/Indexing-data-on-MSSQL-failed-Caused-by-org-apache-solr-common-SolrException-Error-loading-class-com-tp1015137p1019387.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Index compatibility 1.4 Vs 3.1 Trunk
Hello Mr. Hostetter, Thank you very much for the clarification. I do remember that when I first deployed the solr code from trunk on a test server I couldn't open the index (created via 1.4) even via the solr admin page; it kept giving me a corrupted-index EOF kind of exception, so I was curious. Let me try it out again and report to you with the exact error. On Mon, Aug 2, 2010 at 4:28 PM, Chris Hostetter hossman_luc...@fucit.org wrote: : I am trying to use the solr code from 'https://svn.apache.org/repos/asf/lucene/dev/trunk' as my design warrants use of PolyType fields. My understanding is that the indexes are incompatible, am I right? I have about a million docs in my index (indexed via solr 1.4). Is re-indexing my only option or is there a tool of some sort to convert the 1.4 index to 3.1 format? a) the trunk is what will ultimately be Solr 4.x, not 3.x ... for the 3.x line there is a 3x branch... http://wiki.apache.org/solr/Solr3.1 http://wiki.apache.org/solr/Solr4.0 b) The 3x branch can read indexes created by Solr 1.4 -- the first time you add a doc and commit, the new segments will automatically be converted to the new format. I am fairly certain that as of this moment, the 4x trunk can also read indexes created by Solr 1.4, with the same automatic conversion taking place. c) If/When the trunk can no longer read Solr 1.4 indexes, there will be a tool provided for upgrading index versions. -Hoss
Error indexing date
Hi everybody, I'm having an error indexing an XML file. I have configured my schema.xml with all the field names I need, and created an XML file with these fields, but for the fields where I use a date I receive the following error. On my schema.xml the fields that need a date are configured with type=date. On my XML file the date is in this format: 2010-07-31T13:37:35:999Z

Jul 28, 2010 11:40:44 PM org.apache.solr.common.SolrException log
SEVERE: org.apache.solr.common.SolrException: Invalid Date String:''
at org.apache.solr.schema.DateField.parseMath(DateField.java:163)
at org.apache.solr.schema.TrieDateField.createField(TrieDateField.java:171)
at org.apache.solr.schema.SchemaField.createField(SchemaField.java:94)
at org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:246)
at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:60)
at org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:139)
at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:69)
at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298)
at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:857)
at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:588)
at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
at java.lang.Thread.run(Thread.java:619)

Somebody could help me? Tks!! -- Claudio Devecchi
RE: Error indexing date
What's your XML data look like (for the data)? Looks like it's not the same date format Solr accepts. -Original Message- From: Claudio Devecchi [mailto:cdevec...@gmail.com] Sent: Tuesday, August 03, 2010 11:16 AM To: solr-user@lucene.apache.org Subject: Error indexing date Hi everybody, I'm having a error to index a xml file. I have configured my schema xml with all field names I needed, and created a xml with these fields, but in field wich I use some date I receive the following error: On my schema.xml the fields that I need date I configured it with * type=date* On my xml file the date is on this format:* 2010-07-31T13:37:35:999Z* Jul 28, 2010 11:40:44 PM org.apache.solr.common.SolrException log *SEVERE: org.apache.solr.common.SolrException: Invalid Date String:''* at org.apache.solr.schema.DateField.parseMath(DateField.java:163) at org.apache.solr.schema.TrieDateField.createField(TrieDateField.java:171) at org.apache.solr.schema.SchemaField.createField(SchemaField.java:94) at org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:246) at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:60) at org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:139) at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:69) at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206) at 
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233) at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191) at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127) at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102) at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109) at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298) at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:857) at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:588) at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489) at java.lang.Thread.run(Thread.java:619) Somebody could help me? Tks!! -- Claudio Devecchi
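Two things stand out in the exchange above: the exception complains about an empty string (Invalid Date String:''), so at least one document is apparently sending no date value at all, and the sample value 2010-07-31T13:37:35:999Z uses a colon before the milliseconds where Solr's DateField expects a dot (the canonical form is yyyy-MM-dd'T'HH:mm:ss.SSS'Z', always in UTC). A minimal sketch of producing a value Solr will accept; the toSolrDate helper is hypothetical, not part of Solr:

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

public class SolrDateFormat {
    // Format a java.util.Date into the ISO-8601 UTC form Solr's DateField parses.
    public static String toSolrDate(Date d) {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSS'Z'");
        fmt.setTimeZone(TimeZone.getTimeZone("UTC")); // Solr dates are always UTC
        return fmt.format(d);
    }

    public static void main(String[] args) {
        // Epoch zero formats as 1970-01-01T00:00:00.000Z
        System.out.println(toSolrDate(new Date(0L)));
    }
}
```

Note the dot before the fractional seconds: a value like 13:37:35:999Z will be rejected, while an empty date element produces exactly the Invalid Date String:'' error quoted above.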
RE: Indexing data on MSSQL failed: Caused by: org.apache.solr.common.SolrException: Error loading class 'com.micros oft.sqlserver.jdbc.SQLServerDriver'
Just checking - Did you verify the data was being extracted with the DIH UI? http://localhost:8983/solr/admin/dataimport.jsp -Original Message- From: PeterKerk [mailto:vettepa...@hotmail.com] Sent: Tuesday, August 03, 2010 11:28 AM To: solr-user@lucene.apache.org Subject: RE: Indexing data on MSSQL failed: Caused by: org.apache.solr.common.SolrException: Error loading class 'com.micros oft.sqlserver.jdbc.SQLServerDriver' Ok, that works... But it only confirms that the extra data to be indexed I have defined in my data-config does not seem to be indexed. What can I do to debug this? Because nothing special is logged in my cygwin window! I have even simplified the data-config further to this:

<dataConfig>
  <dataSource driver="com.microsoft.sqlserver.jdbc.SQLServerDriver" url="jdbc:sqlserver://localhost:1433;databaseName=wedding" user="sa" password="123456" />
  <document name="weddinglocations">
    <entity name="location" query="select * from locations">
      <field column="ID" name="id" />
      <field column="TITLE" name="title" />
      <field column="CITY" name="city" />
      <entity name="location_theme" query="select themeid from location_themes where locationid='${location.ID}'">
        <entity name="theme" query="select title from themes where id = '${location_theme.themeid}'">
          <field name="cat" column="title" />
        </entity>
      </entity>
    </entity>
  </document>
</dataConfig>

I have attached a screenshot of my datamodel for this part. http://lucene.472066.n3.nabble.com/file/n1019514/DBmodel.png DBmodel.png More help is greatly appreciated! :) -- View this message in context: http://lucene.472066.n3.nabble.com/Indexing-data-on-MSSQL-failed-Caused-by-org-apache-solr-common-SolrException-Error-loading-class-com-tp1015137p1019514.html Sent from the Solr - User mailing list archive at Nabble.com.
RE: Indexing data on MSSQL failed: Caused by: org.apache.solr.common.SolrException: Error loading class 'com.micros oft.sqlserver.jdbc.SQLServerDriver'
I had a look at this URL: http://localhost:8983/solr/db/admin/dataimport.jsp?handler=/dataimport There I see the data-config.xml on the left side and the full data import result on the right. The response is:

<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">234</int>
  </lst>
  <lst name="initArgs">
    <lst name="defaults">
      <str name="config">wedding-data-config.xml</str>
    </lst>
  </lst>
  <str name="command">full-import</str>
  <str name="mode">debug</str>
  <arr name="documents">
    <lst>
      <arr name="id"><int>1</int></arr>
      <arr name="title"><str>Gemeentehuis Nijmegen</str></arr>
    </lst>
    <lst>
      <arr name="id"><int>2</int></arr>
      <arr name="title"><str>Gemeentehuis Utrecht</str></arr>
    </lst>
    <lst>
      <arr name="id"><int>3</int></arr>
      <arr name="title"><str>Beachclub Vroeger</str></arr>
    </lst>
  </arr>
  <lst name="verbose-output"/>
  <str name="status">idle</str>
  <str name="importResponse">Configuration Re-loaded sucessfully</str>
  <lst name="statusMessages">
    <str name="Total Requests made to DataSource">7</str>
    <str name="Total Rows Fetched">3</str>
    <str name="Total Documents Skipped">0</str>
    <str name="Full Dump Started">2010-08-03 18:13:15</str>
    <str name="">Indexing completed. Added/Updated: 3 documents. Deleted 0 documents.</str>
    <str name="Total Documents Processed">3</str>
    <str name="Time taken">0:0:0.219</str>
  </lst>
  <str name="WARNING">This response format is experimental. It is likely to change in the future.</str>
</response>

So there again, I don't see the themes or features back. Anything I can check here for you? -- View this message in context: http://lucene.472066.n3.nabble.com/Indexing-data-on-MSSQL-failed-Caused-by-org-apache-solr-common-SolrException-Error-loading-class-com-tp1015137p1019666.html Sent from the Solr - User mailing list archive at Nabble.com.
Multi word synonyms
I'm having trouble getting multi word synonyms to work. As an example I have the following synonym; exercise dvds => fitness When I search for exercise dvds I want to return all docs in the index which contain the keyword fitness. I've read the wiki about solr.SynonymFilterFactory which recommends expanding the synonym when indexing, but I'm not sure this is what I want as none of my documents have the keywords exercise dvds. Here is the field definition from my schema.xml; When I test my search with the analysis page on the admin console it seems to work fine; Query Analyzer org.apache.solr.analysis.WhitespaceTokenizerFactory {} term position 1 2 term text exercise dvds term type word word source start,end 0,8 9,13 payload org.apache.solr.analysis.SynonymFilterFactory {ignoreCase=true, synonyms=synonyms.txt, expand=true} term position 1 term text fitness term type word source start,end 0,13 payload org.apache.solr.analysis.TrimFilterFactory {} term position 1 term text fitness term type word source start,end 0,13 payload org.apache.solr.analysis.StopFilterFactory {ignoreCase=true, enablePositionIncrements=true, words=stopwords.txt} term position 1 term text fitness term type word source start,end 0,13 payload org.apache.solr.analysis.LowerCaseFilterFactory {} term position 1 term text fitness term type word source start,end 0,13 payload org.apache.solr.analysis.SnowballPorterFilterFactory {language=English, protected=protwords.txt} term position 1 term text fit term type word source start,end 0,13 payload ...but when I perform the search it doesn't seem to use the SynonymFilterFactory; 0 0 exercise dvds 0 on standard 2.2 standard on *,score 10 . exercise dvds exercise dvds PRODUCTKEYWORDS:exercis PRODUCTKEYWORDS:dvds PRODUCTKEYWORDS:exercis PRODUCTKEYWORDS:dvds -- View this message in context: http://lucene.472066.n3.nabble.com/Multi-word-synomyms-tp1019722p1019722.html Sent from the Solr - User mailing list archive at Nabble.com.
RE: Indexing data on MSSQL failed: Caused by: org.apache.solr.common.SolrException: Error loading class 'com.micros oft.sqlserver.jdbc.SQLServerDriver'
Hmm. I bet your location query is all wrong. Take this: query=select themeid from location_themes where locationid='${location.ID}' I'm pretty sure that locationid is not a string (since it's extracted as an int below), which means your SQL query will be trying to match an int against a string. Take the single quotes out, thus: query=select themeid from location_themes where locationid=${location.ID} ...and it should work. Either way, you should test it in SQL Server Management Studio. -Original Message- From: PeterKerk [mailto:vettepa...@hotmail.com] Sent: Tuesday, August 03, 2010 12:16 PM To: solr-user@lucene.apache.org Subject: RE: Indexing data on MSSQL failed: Caused by: org.apache.solr.common.SolrException: Error loading class 'com.micros oft.sqlserver.jdbc.SQLServerDriver' I had a look at this URL: http://localhost:8983/solr/db/admin/dataimport.jsp?handler=/dataimport There I see the data-config.xml on the left side and the full data import result on the right. The response is: response − lst name=responseHeader int name=status0/int int name=QTime234/int /lst − lst name=initArgs − lst name=defaults str name=configwedding-data-config.xml/str /lst /lst str name=commandfull-import/str str name=modedebug/str − arr name=documents − lst − arr name=id int1/int /arr − arr name=title strGemeentehuis Nijmegen/str /arr /lst − lst − arr name=id int2/int /arr − arr name=title strGemeentehuis Utrecht/str /arr /lst − lst − arr name=id int3/int /arr − arr name=title strBeachclub Vroeger/str /arr /lst /arr lst name=verbose-output/ str name=statusidle/str str name=importResponseConfiguration Re-loaded sucessfully/str − lst name=statusMessages str name=Total Requests made to DataSource7/str str name=Total Rows Fetched3/str str name=Total Documents Skipped0/str str name=Full Dump Started2010-08-03 18:13:15/str − str name= Indexing completed. Added/Updated: 3 documents. Deleted 0 documents. 
/str str name=Total Documents Processed3/str str name=Time taken 0:0:0.219/str /lst − str name=WARNING This response format is experimental. It is likely to change in the future. /str /response So there again, I dont see the themes or features back. Anything I can check here for you? -- View this message in context: http://lucene.472066.n3.nabble.com/Indexing-data-on-MSSQL-failed-Caused-by-org-apache-solr-common-SolrException-Error-loading-class-com-tp1015137p1019666.html Sent from the Solr - User mailing list archive at Nabble.com.
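The fix suggested above (dropping the single quotes around ${location.ID} so the int column is compared against an int), applied to the data-config quoted earlier in the thread, would read as follows. This is a sketch; only the quoting around ${location.ID} changes:

```xml
<!-- locationid is an int column, so ${location.ID} must not be wrapped in single quotes -->
<entity name="location_theme"
        query="select themeid from location_themes where locationid=${location.ID}">
  <entity name="theme"
          query="select title from themes where id = '${location_theme.themeid}'">
    <field name="cat" column="title"/>
  </entity>
</entity>
```

The inner query keeps its quotes only if themes.id is a character column; as the reply says, both statements are worth testing directly in SQL Server Management Studio first.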
Re: Queries with multiple wildcards failing in branch3x
Just reporting back, no issues on the latest branch3x build with your revert of the optimization. --Paul On Tue, Aug 3, 2010 at 9:22 AM, Paul Dlug paul.d...@gmail.com wrote: Sure, I'm reindexing now, I'll let you know how it goes. --Paul On Tue, Aug 3, 2010 at 9:05 AM, Michael McCandless luc...@mikemccandless.com wrote: Ugh... I think there may still be a bug lurking. Karl is also still having problems, much further into his indexing process. I'm hunting it now!! For the time being, I just disabled (committed to trunk 3x) the optimization that's causing the bug. Can you update to 3x head (or trunk head), remove your current index, and try again? Mike On Tue, Aug 3, 2010 at 8:52 AM, Paul Dlug paul.d...@gmail.com wrote: Thanks, I updated to the latest version with the fix but I'm now getting another error when optimizing the index (or when searching certain fields). It mentions unknown compression method but I'm not using compressed fields at all. SEVERE: java.io.IOException: background merge hit exception: _a:C248670/19645 _l:C206701/14563 _m:C12186/100 _n:C11356 _o:C9945 _p:C9000 _q:C5704 _r:C2214 _s:C2000 _t:C1264 into _u [optimize] [mergeDocStores] at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2392) at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2320) at org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:403) at org.apache.solr.update.processor.RunUpdateProcessor.processCommit(RunUpdateProcessorFactory.java:85) at org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:169) at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:69) at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1322) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:341) at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:244) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206) at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233) at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191) at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127) at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102) at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109) at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298) at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:857) at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:588) at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489) at java.lang.Thread.run(Thread.java:619) Caused by: org.apache.lucene.index.CorruptIndexException: field data are in wrong format: java.util.zip.DataFormatException: unknown compression method at org.apache.lucene.index.FieldsReader.uncompress(FieldsReader.java:585) at org.apache.lucene.index.FieldsReader.addField(FieldsReader.java:357) at org.apache.lucene.index.FieldsReader.doc(FieldsReader.java:239) at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:894) at org.apache.lucene.index.IndexReader.document(IndexReader.java:684) at org.apache.lucene.index.SegmentMerger.copyFieldsWithDeletions(SegmentMerger.java:410) at org.apache.lucene.index.SegmentMerger.mergeFields(SegmentMerger.java:338) at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:159) at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4053) at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3647) at 
org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:339) at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:407) Caused by: java.util.zip.DataFormatException: unknown compression method at java.util.zip.Inflater.inflateBytes(Native Method) at java.util.zip.Inflater.inflate(Inflater.java:238) at java.util.zip.Inflater.inflate(Inflater.java:256) at org.apache.lucene.document.CompressionTools.decompress(CompressionTools.java:106) at org.apache.lucene.index.FieldsReader.uncompress(FieldsReader.java:582) ... 11 more On Mon, Aug 2, 2010 at 6:04 PM,
RE: Indexing data on MSSQL failed: Caused by: org.apache.solr.common.SolrException: Error loading class 'com.micros oft.sqlserver.jdbc.SQLServerDriver'
Hi, You are correct that locationid is an integer. I have changed it to:

<entity name="location_theme" query="select themeid from location_themes where locationid=${location.ID}">

But then I get the error: Incorrect syntax near '='. Even though that statement does work in mgmt studio.

SEVERE: Full Import failed org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to execute query: select themeid from location_themes where locationid= Processing Document # 1
at org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:72)
at org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.<init>(JdbcDataSource.java:253)
at org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:210)
at org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:39)
at org.apache.solr.handler.dataimport.SqlEntityProcessor.initQuery(SqlEntityProcessor.java:58)
at org.apache.solr.handler.dataimport.SqlEntityProcessor.nextRow(SqlEntityProcessor.java:71)
at org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:237)
at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:357)
at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:383)
at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:242)
at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:180)
at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:331)
at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:389)
at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:370)
Caused by: com.microsoft.sqlserver.jdbc.SQLServerException: Incorrect syntax near '='.
at com.microsoft.sqlserver.jdbc.SQLServerException.makeFromDatabaseError(SQLServerException.java:197)
at com.microsoft.sqlserver.jdbc.SQLServerStatement.getNextResult(SQLServerStatement.java:1493)
at com.microsoft.sqlserver.jdbc.SQLServerStatement.doExecuteStatement(SQLServerStatement.java:775)
at com.microsoft.sqlserver.jdbc.SQLServerStatement$StmtExecCmd.doExecute(SQLServerStatement.java:676)
at com.microsoft.sqlserver.jdbc.TDSCommand.execute(IOBuffer.java:4575)
at com.microsoft.sqlserver.jdbc.SQLServerConnection.executeCommand(SQLServerConnection.java:1400)
at com.microsoft.sqlserver.jdbc.SQLServerStatement.executeCommand(SQLServerStatement.java:179)
at com.microsoft.sqlserver.jdbc.SQLServerStatement.executeStatement(SQLServerStatement.java:154)
at com.microsoft.sqlserver.jdbc.SQLServerStatement.execute(SQLServerStatement.java:649)
at org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.<init>(JdbcDataSource.java:246)
... 12 more

Aug 3, 2010 6:44:16 PM org.apache.solr.update.DirectUpdateHandler2 rollback
INFO: start rollback
Aug 3, 2010 6:44:16 PM org.apache.solr.update.DirectUpdateHandler2 rollback
INFO: end_rollback

-- View this message in context: http://lucene.472066.n3.nabble.com/Indexing-data-on-MSSQL-failed-Caused-by-org-apache-solr-common-SolrException-Error-loading-class-com-tp1015137p1019753.html Sent from the Solr - User mailing list archive at Nabble.com.
RE: Multi word synonyms
Hi, This happens because your tokenizer will generate separate tokens for `exercise dvds`, so the SynonymFilter will try to find declared synonyms for `exercise` and `dvds` separately. Its behavior is documented [1] on the wiki. [1]: http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.SynonymFilterFactory Cheers, -Original message- From: Qwerky neil.j.tay...@hmv.co.uk Sent: Tue 03-08-2010 18:35 To: solr-user@lucene.apache.org; Subject: Multi word synomyms I'm having trouble getting multi word synonyms to work. As an example I have the following synonym; exercise dvds => fitness When I search for exercise dvds I want to return all docs in the index which contain the keyword fitness. I've read the wiki about solr.SynonymFilterFactory which recommends expanding the synonym when indexing, but I'm not sure this is what I want as none of my documents have the keywords exercise dvds. Here is the field definition from my schema.xml; When I test my search with the analysis page on the admin console it seems to work fine; Query Analyzer org.apache.solr.analysis.WhitespaceTokenizerFactory {} term position 1 2 term text exercise dvds term type word word source start,end 0,8 9,13 payload org.apache.solr.analysis.SynonymFilterFactory {ignoreCase=true, synonyms=synonyms.txt, expand=true} term position 1 term text fitness term type word source start,end 0,13 payload org.apache.solr.analysis.TrimFilterFactory {} term position 1 term text fitness term type word source start,end 0,13 payload org.apache.solr.analysis.StopFilterFactory {ignoreCase=true, enablePositionIncrements=true, words=stopwords.txt} term position 1 term text fitness term type word source start,end 0,13 payload org.apache.solr.analysis.LowerCaseFilterFactory {} term position 1 term text fitness term type word source start,end 0,13 payload org.apache.solr.analysis.SnowballPorterFilterFactory {language=English, protected=protwords.txt} term position 1 term text fit term type word source start,end 0,13
payload ...but when I perform the search it doesn't seem to use the SynonymFilterFactory; 0 0 exercise dvds 0 on standard 2.2 standard on *,score 10 . exercise dvds exercise dvds PRODUCTKEYWORDS:exercis PRODUCTKEYWORDS:dvds PRODUCTKEYWORDS:exercis PRODUCTKEYWORDS:dvds -- View this message in context: http://lucene.472066.n3.nabble.com/Multi-word-synomyms-tp1019722p1019722.html Sent from the Solr - User mailing list archive at Nabble.com.
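For reference, the index-time expansion the wiki recommends could be wired up roughly like this (a sketch only; the fieldType name and surrounding filters are illustrative, not taken from the poster's schema, which was stripped from the archive):

```xml
<!-- Sketch of an index-time synonym analyzer; names are illustrative. -->
<fieldType name="text_syn" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- expand=true maps "exercise dvds" and "fitness" onto each other at
         index time, before any query parser can split on whitespace -->
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

With expansion done at index time, a query for either phrase can match documents containing the other term, which sidesteps the query-side splitting problem described above.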
Re: Multi word synomyms
Unfortunately, Lucene's QueryParser pre-splits all incoming text on whitespace, which means your search-time analyzer never has a chance to detect the multi-word synonym. Ie, your analyzer is invoked twice. Once with exercise and once with dvds. We need to fix that... but it's not exactly clear how. The QueryParser/Analyzer interaction is tricky. Mike On Tue, Aug 3, 2010 at 12:34 PM, Qwerky neil.j.tay...@hmv.co.uk wrote: I'm having trouble getting multi word synonyms to work. As an example I have the following synonym; exercise dvds = fitness When I search for exercise dvds I want to return all docs in the index which contain the keyword fitness. I've read the wiki about solr.SynonymFilterFactory which recommends expanding the synonym when indexing, but I'm not sure this is what I want as none of my documents have the keywords exercise dvds. Here is the field definition from my schema.xml; When I test my search with the analysis page on the admin console it seems to work fine; Query Analyzer org.apache.solr.analysis.WhitespaceTokenizerFactory {} term position 1 2 term text exercise dvds term type word word source start,end 0,8 9,13 payload org.apache.solr.analysis.SynonymFilterFactory {ignoreCase=true, synonyms=synonyms.txt, expand=true} term position 1 term text fitness term type word source start,end 0,13 payload org.apache.solr.analysis.TrimFilterFactory {} term position 1 term text fitness term type word source start,end 0,13 payload org.apache.solr.analysis.StopFilterFactory {ignoreCase=true, enablePositionIncrements=true, words=stopwords.txt} term position 1 term text fitness term type word source start,end 0,13 payload org.apache.solr.analysis.LowerCaseFilterFactory {} term position 1 term text fitness term type word source start,end 0,13 payload org.apache.solr.analysis.SnowballPorterFilterFactory {language=English, protected=protwords.txt} term position 1 term text fit term type word source start,end 0,13 payload ...but when I perform the search it doesn't 
seem to use the SynonymFilterFactory; 0 0 exercise dvds 0 on standard 2.2 standard on *,score 10 . exercise dvds exercise dvds PRODUCTKEYWORDS:exercis PRODUCTKEYWORDS:dvds PRODUCTKEYWORDS:exercis PRODUCTKEYWORDS:dvds -- View this message in context: http://lucene.472066.n3.nabble.com/Multi-word-synomyms-tp1019722p1019722.html Sent from the Solr - User mailing list archive at Nabble.com.
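Until that is fixed in the query parser, one workaround sometimes used (an assumption about this setup, not something suggested in the thread) is to send the phrase quoted, so the whole string reaches the field analyzer as a unit before any whitespace splitting:

```text
q=PRODUCTKEYWORDS:"exercise dvds"
```

Whether the synonym then matches still depends on how the SynonymFilter emits its tokens, so verify the parsed query with debugQuery=on.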
RE: Indexing data on MSSQL failed: Caused by: org.apache.solr.common.SolrException: Error loading class 'com.microsoft.sqlserver.jdbc.SQLServerDriver'
Looks like ${location.ID} isn't being pulled out correctly. I'd suggest playing around (e.g. with capitalization). Still, I can't say I know why it's failing. -Original Message- From: PeterKerk [mailto:vettepa...@hotmail.com] Sent: Tuesday, August 03, 2010 12:48 PM To: solr-user@lucene.apache.org Subject: RE: Indexing data on MSSQL failed: Caused by: org.apache.solr.common.SolrException: Error loading class 'com.microsoft.sqlserver.jdbc.SQLServerDriver' Hi, You are correct that locationid is an integer. I have changed it to: entity name=location_theme query=select themeid from location_themes where locationid=${location.ID} But then I get the error: Incorrect syntax near '=' Even though that statement does work in mgmt studio SEVERE: Full Import failed org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to execute query: select themeid from location_themes where locationid= Processing Document # 1 at org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:72) at org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.<init>(JdbcDataSource.java:253) at org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:210) at org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:39) at org.apache.solr.handler.dataimport.SqlEntityProcessor.initQuery(SqlEntityProcessor.java:58) at org.apache.solr.handler.dataimport.SqlEntityProcessor.nextRow(SqlEntityProcessor.java:71) at org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:237) at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:357) at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:383) at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:242) at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:180) at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:331) at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:389) at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:370) Caused by: com.microsoft.sqlserver.jdbc.SQLServerException: Incorrect syntax near '='. 
at com.microsoft.sqlserver.jdbc.SQLServerException.makeFromDatabaseError(SQLServerException.java:197) at com.microsoft.sqlserver.jdbc.SQLServerStatement.getNextResult(SQLServerStatement.java:1493) at com.microsoft.sqlserver.jdbc.SQLServerStatement.doExecuteStatement(SQLServerStatement.java:775) at com.microsoft.sqlserver.jdbc.SQLServerStatement$StmtExecCmd.doExecute(SQLServerStatement.java:676) at com.microsoft.sqlserver.jdbc.TDSCommand.execute(IOBuffer.java:4575) at com.microsoft.sqlserver.jdbc.SQLServerConnection.executeCommand(SQLServerConnection.java:1400) at com.microsoft.sqlserver.jdbc.SQLServerStatement.executeCommand(SQLServerStatement.java:179) at com.microsoft.sqlserver.jdbc.SQLServerStatement.executeStatement(SQLServerStatement.java:154) at com.microsoft.sqlserver.jdbc.SQLServerStatement.execute(SQLServerStatement.java:649) at org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.<init>(JdbcDataSource.java:246) ... 12 more Aug 3, 2010 6:44:16 PM org.apache.solr.update.DirectUpdateHandler2 rollback INFO: start rollback Aug 3, 2010 6:44:16 PM org.apache.solr.update.DirectUpdateHandler2 rollback INFO: end_rollback -- View this message in context: http://lucene.472066.n3.nabble.com/Indexing-data-on-MSSQL-failed-Caused-by-org-apache-solr-common-SolrException-Error-loading-class-com-tp1015137p1019753.html Sent from the Solr - User mailing list archive at Nabble.com.
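Reconstructed from the discussion, the sub-entity being debugged would look roughly like this in data-config.xml (a sketch; table and column names follow the thread). The quoting of the variable, and as it later turns out its case, is the point at issue: an unresolved ${location.ID} leaves the SQL ending in locationid= with nothing after it, which is exactly the "Incorrect syntax near '='" error above:

```xml
<!-- Sketch of the DIH sub-entity; the variable must be quoted in the SQL
     and must use the case under which DIH exposes the parent row's column. -->
<entity name="location_theme"
        query="select themeid from location_themes where locationid='${location.id}'"/>
```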
Re: Error indexing date
Hi Michael, My XML has this: field name=signupdate2010-07-31T13:37:35.999Z/field And in my schema.xml I have this: field name=signupdate type=date indexed=true stored=true required=true / Tks! On Tue, Aug 3, 2010 at 12:19 PM, Michael Griffiths mgriffi...@am-ind.com wrote: What's your XML data look like (for the data)? Looks like it's not the same date format Solr accepts. -Original Message- From: Claudio Devecchi [mailto:cdevec...@gmail.com] Sent: Tuesday, August 03, 2010 11:16 AM To: solr-user@lucene.apache.org Subject: Error indexing date Hi everybody, I'm having an error indexing an xml file. I have configured my schema.xml with all the field names I needed, and created an xml with these fields, but in the fields where I use a date I receive the following error: On my schema.xml the fields that need a date are configured with type=date On my xml file the date is in this format: 2010-07-31T13:37:35:999Z Jul 28, 2010 11:40:44 PM org.apache.solr.common.SolrException log SEVERE: org.apache.solr.common.SolrException: Invalid Date String:'' at org.apache.solr.schema.DateField.parseMath(DateField.java:163) at org.apache.solr.schema.TrieDateField.createField(TrieDateField.java:171) at org.apache.solr.schema.SchemaField.createField(SchemaField.java:94) at org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:246) at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:60) at org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:139) at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:69) at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241) at
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206) at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233) at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191) at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127) at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102) at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109) at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298) at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:857) at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:588) at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489) at java.lang.Thread.run(Thread.java:619) Somebody could help me? Tks!! -- Claudio Devecchi -- Claudio Devecchi flickr.com/cdevecchi
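Solr's DateField expects the canonical ISO 8601 UTC form 1995-12-31T23:59:59.999Z, with a dot (not a colon) before the milliseconds. A sketch of producing such a string in Python (the helper name is mine, not part of the thread):

```python
from datetime import datetime, timezone

def to_solr_date(dt: datetime) -> str:
    """Format a datetime in the form Solr's DateField parses:
    yyyy-MM-ddTHH:mm:ss.SSSZ, always in UTC."""
    dt = dt.astimezone(timezone.utc)
    return dt.strftime("%Y-%m-%dT%H:%M:%S.") + f"{dt.microsecond // 1000:03d}Z"

# The thread's value, written correctly ('.' rather than ':' before the millis)
print(to_solr_date(datetime(2010, 7, 31, 13, 37, 35, 999000, tzinfo=timezone.utc)))
# → 2010-07-31T13:37:35.999Z
```

Serializing through a helper like this avoids hand-built strings with the wrong separator, which is the problem diagnosed later in this thread.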
RE: Indexing data on MSSQL failed: Caused by: org.apache.solr.common.SolrException: Error loading class 'com.microsoft.sqlserver.jdbc.SQLServerDriver'
Awesome! You did it! Turned out, I had to change the casing of this line: entity name=location_feature query=select featureid from location_features where locationid='${location.ID}' to: entity name=location_feature query=select featureid from location_features where locationid='${location.id}' Now the resultset is like this: response lst name=responseHeader int name=status0/int int name=QTime16/int lst name=params str name=indenton/str str name=start0/str str name=q*:*/str str name=version2.2/str str name=rows10/str /lst /lst result name=response numFound=3 start=0 doc arr name=cat strGemeentehuis/str /arr arr name=features strTuin/str strCafe/str /arr str name=id1/str date name=timestamp2010-08-03T17:31:21.562Z/date str name=titleGemeentehuis Nijmegen/str /doc doc arr name=cat strGemeentehuis/str /arr arr name=features strTuin/str strCafe/str strDanszaal/str /arr str name=id2/str date name=timestamp2010-08-03T17:31:21.593Z/date str name=titleGemeentehuis Utrecht/str /doc doc arr name=cat strStrand Zee/str /arr arr name=features strStrand/str strCafe/str strDanszaal/str /arr str name=id3/str date name=timestamp2010-08-03T17:31:21.609Z/date str name=titleBeachclub Vroeger/str /doc /result /response And now the FINAL question I have: as you can see above features is a facet. What should I type in the solr admin page on http://localhost:8983/solr/db/admin/ to filter all records on the features facet? e.g. to show all records that have a feature Cafe? Thanks! -- View this message in context: http://lucene.472066.n3.nabble.com/Indexing-data-on-MSSQL-failed-Caused-by-org-apache-solr-common-SolrException-Error-loading-class-com-tp1015137p1020051.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Error indexing date
: On my xml file the date is on this format:* 2010-07-31T13:37:35:999Z* : Jul 28, 2010 11:40:44 PM org.apache.solr.common.SolrException log : *SEVERE: org.apache.solr.common.SolrException: Invalid Date String:''* According to that error message, you are attempting to index a date field with a value that is an empty string, Ie... field name=signupdate/field ...or perhaps... field name=signupdate / (xml makes no distinction) -Hoss
Re: Error indexing date
My field is not empty; I have the date in the field and the error still happens. Is date the right type to put in schema.xml, or is there another? On Tue, Aug 3, 2010 at 2:38 PM, Chris Hostetter hossman_luc...@fucit.org wrote: : On my xml file the date is on this format:* 2010-07-31T13:37:35:999Z* : Jul 28, 2010 11:40:44 PM org.apache.solr.common.SolrException log : *SEVERE: org.apache.solr.common.SolrException: Invalid Date String:''* According to that error message, you are attempting to index a date field with a value that is an empty string, Ie... field name=signupdate/field ...or perhaps... field name=signupdate / (xml makes no distinction) -Hoss -- Claudio Devecchi flickr.com/cdevecchi
RE: Indexing data on MSSQL failed: Caused by: org.apache.solr.common.SolrException: Error loading class 'com.microsoft.sqlserver.jdbc.SQLServerDriver'
Well, first of all I'd suggest installing Velocity, and using that as your test querying interface... But try ?fq=feature:Cafe -Original Message- From: PeterKerk [mailto:vettepa...@hotmail.com] Sent: Tuesday, August 03, 2010 1:38 PM To: solr-user@lucene.apache.org Subject: RE: Indexing data on MSSQL failed: Caused by: org.apache.solr.common.SolrException: Error loading class 'com.micros oft.sqlserver.jdbc.SQLServerDriver' Awesome! You did it! Turned out, I had to change the casing of this line: entity name=location_feature query=select featureid from location_features where locationid='${location.ID}' to: entity name=location_feature query=select featureid from location_features where locationid='${location.id}' Now the resultset is like this: response − lst name=responseHeader int name=status0/int int name=QTime16/int − lst name=params str name=indenton/str str name=start0/str str name=q*:*/str str name=version2.2/str str name=rows10/str /lst /lst − result name=response numFound=3 start=0 − doc − arr name=cat strGemeentehuis/str /arr − arr name=features strTuin/str strCafe/str /arr str name=id1/str date name=timestamp2010-08-03T17:31:21.562Z/date str name=titleGemeentehuis Nijmegen/str /doc − doc − arr name=cat strGemeentehuis/str /arr − arr name=features strTuin/str strCafe/str strDanszaal/str /arr str name=id2/str date name=timestamp2010-08-03T17:31:21.593Z/date str name=titleGemeentehuis Utrecht/str /doc − doc − arr name=cat strStrand Zee/str /arr − arr name=features strStrand/str strCafe/str strDanszaal/str /arr str name=id3/str date name=timestamp2010-08-03T17:31:21.609Z/date str name=titleBeachclub Vroeger/str /doc /result /response And now the FINAL question I have: as you can see above features is a facet. What should I type in the solr admin page on http://localhost:8983/solr/db/admin/ to filter all records on the features facet? e.g. to show all records that have a feature Cafe? Thanks! 
-- View this message in context: http://lucene.472066.n3.nabble.com/Indexing-data-on-MSSQL-failed-Caused-by-org-apache-solr-common-SolrException-Error-loading-class-com-tp1015137p1020051.html Sent from the Solr - User mailing list archive at Nabble.com.
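Note that in the response above the multi-valued field is named features (plural), so the filter query probably needs the plural form. Assuming the core path from the thread's admin URL, a faceted request might look like:

```text
http://localhost:8983/solr/db/select?q=*:*&fq=features:Cafe&facet=true&facet.field=features
```

The fq parameter restricts the result set to documents with the Cafe feature, while facet.field=features returns the counts per feature alongside the results.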
Re: Scoring on multi-valued fields
I checked the explain query. What happens is that the sums of all the hits on ID are added up. Is there a way to only grab the first score? Thanks. -- View this message in context: http://lucene.472066.n3.nabble.com/Scoring-on-multi-valued-fields-tp1017624p1020150.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Scoring on multi-valued fields
Oh sorry guys, I didn't correctly submit my original post to the mailing list. The original message was this: Hello all. We are having some trouble with queries similar to the type shown below: name: pizza OR (id:10 OR id:20 OR id:30) (id is a multi-valued field) With the above query, we will always get documents with pizza in the name, and any document with id values of 10, 20, and 30 will always come up first. What we would like is to have a document with only id 10 to be weighted the same as a document with ids 10, 20, and 30. Is this possible with Lucene/Solr? Thanks in advance for any assistance you might be able to offer. -- View this message in context: http://lucene.472066.n3.nabble.com/Scoring-on-multi-valued-fields-tp1017624p1020181.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Scoring on multi-valued fields
On Tue, Aug 3, 2010 at 2:42 PM, oleg.gnatovskiy crooke...@gmail.com wrote: Oh sorry guys, I didn't correctly submit my original post to the mailing list. The original message was this: Hello all. We are having some trouble with queries similar to the type shown below: name: pizza OR (id:10 OR id:20 OR id:30) (id is a multi-valued field) With the above query, we will always get documents with pizza in the name, and any document with id values of 10, 20, and 30 will always come up first. What we would like is to have a document with only id 10 to be weighted the same as a document with ids 10, 20, and 30. How do you want pizza weighted against 10, 20, or 30? If pizza can always come first, you can boost the second clause to zero: pizza OR (id:10 OR id:20 OR id:30)^0 What happens is that the sums of all the hits on ID are added up. Is there a way to only grab the first score? There is a way to grab only the highest score from a set of options (DisjunctionMaxQuery) but unfortunately there is no general query parser syntax to support that yet. -Yonik http://www.lucidimagination.com
Re: Scoring on multi-valued fields
Sorry, guess I messed up my example query. The query should look like this: name:pizza AND id:(10 OR 20 OR 30) Thus if I do name:pizza^10 AND id:(10 OR 20 OR 30)^0 wouldn't a document that has all the ids (10, 20, and 30) still come up higher than a document that has just one? -- View this message in context: http://lucene.472066.n3.nabble.com/Scoring-on-multi-valued-fields-tp1017624p1020234.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Scoring on multi-valued fields
On Tue, Aug 3, 2010 at 3:16 PM, oleg.gnatovskiy crooke...@gmail.com wrote: Sorry guess I messed up my example query. The query should look like this: name:pizza AND id:(10 OR 20 OR 30) Thus if I do name:pizza^10 AND id:(10 OR 20 OR 30)^0 wouldn't a document that has all the ids (10, 20, and 30) still come up higher than a document that has just one? No, because the whole id:(10 OR 20 OR 30)^0 clause will contribute 0 to the final score. Another way to get the same effect would be to pull it out as a filter: q=name:pizza&fq=id:(10 OR 20 OR 30) -Yonik http://www.lucidimagination.com
Re: Error indexing date
Does somebody have an idea? My fields are not null and solr apparently thinks that they are On Tue, Aug 3, 2010 at 2:55 PM, Claudio Devecchi cdevec...@gmail.com wrote: My field is not empty, I have the date on the field and the error happens, the type to put on schema.xml is date? or have any other? On Tue, Aug 3, 2010 at 2:38 PM, Chris Hostetter hossman_luc...@fucit.org wrote: : On my xml file the date is on this format:* 2010-07-31T13:37:35:999Z* : Jul 28, 2010 11:40:44 PM org.apache.solr.common.SolrException log : *SEVERE: org.apache.solr.common.SolrException: Invalid Date String:''* According to that error message, you are attempting to index a date field with a value that is an empty string, Ie... field name=signupdate/field ...or perhaps... field name=signupdate / (xml makes no distinction) -Hoss -- Claudio Devecchi flickr.com/cdevecchi -- Claudio Devecchi flickr.com/cdevecchi
RE: Error indexing date
I'd guess the DIH is not extracting the date correctly. Either way, Solr is not retrieving the date. -Original Message- From: Claudio Devecchi [mailto:cdevec...@gmail.com] Sent: Tuesday, August 03, 2010 3:45 PM To: solr-user@lucene.apache.org Subject: Re: Error indexing date Somebody have an idea? My fields are not null and solr apparently thinks that they are On Tue, Aug 3, 2010 at 2:55 PM, Claudio Devecchi cdevec...@gmail.comwrote: My field is not empty, I have the date on the field and the error happens, the type to put on schema.xml is date? or have any other? On Tue, Aug 3, 2010 at 2:38 PM, Chris Hostetter hossman_luc...@fucit.orgwrote: : On my xml file the date is on this format:* 2010-07-31T13:37:35:999Z* : Jul 28, 2010 11:40:44 PM org.apache.solr.common.SolrException log : *SEVERE: org.apache.solr.common.SolrException: Invalid Date String:''* According to that error message, you are attempting to index a date field with a value that is an empty string, Ie... field name=signupdate/field ...or perhaps... field name=signupdate / (xml makes no distinction) -Hoss -- Claudio Devecchi flickr.com/cdevecchi -- Claudio Devecchi flickr.com/cdevecchi
Re: Error indexing date
That is because it is an illegal ISO 8601 datetime. The seconds portion should be 35.999, not 35:999. wunder On Aug 3, 2010, at 12:55 PM, Michael Griffiths wrote: I'd guess the DIH is not extracting the date correctly. Either way, Solr is not retrieving the date. -Original Message- From: Claudio Devecchi [mailto:cdevec...@gmail.com] Sent: Tuesday, August 03, 2010 3:45 PM To: solr-user@lucene.apache.org Subject: Re: Error indexing date Somebody have an idea? My fields are not null and solr apparently thinks that they are On Tue, Aug 3, 2010 at 2:55 PM, Claudio Devecchi cdevec...@gmail.comwrote: My field is not empty, I have the date on the field and the error happens, the type to put on schema.xml is date? or have any other? On Tue, Aug 3, 2010 at 2:38 PM, Chris Hostetter hossman_luc...@fucit.orgwrote: : On my xml file the date is on this format:* 2010-07-31T13:37:35:999Z* : Jul 28, 2010 11:40:44 PM org.apache.solr.common.SolrException log : *SEVERE: org.apache.solr.common.SolrException: Invalid Date String:''* According to that error message, you are attempting to index a date field with a value that is an empty string, Ie... field name=signupdate/field ...or perhaps... field name=signupdate / (xml makes no distinction) -Hoss -- Claudio Devecchi flickr.com/cdevecchi -- Claudio Devecchi flickr.com/cdevecchi
Re: Error indexing date
Hi guys... I already changed the : to . and nothing happens; the problem is that solr ignores the content of the date, but if it is in an incorrect format, solr shows this tks On Tue, Aug 3, 2010 at 5:05 PM, Walter Underwood wun...@wunderwood.org wrote: That is because it is an illegal ISO 8601 datetime. The seconds portion should be 35.999, not 35:999. wunder On Aug 3, 2010, at 12:55 PM, Michael Griffiths wrote: I'd guess the DIH is not extracting the date correctly. Either way, Solr is not retrieving the date. -Original Message- From: Claudio Devecchi [mailto:cdevec...@gmail.com] Sent: Tuesday, August 03, 2010 3:45 PM To: solr-user@lucene.apache.org Subject: Re: Error indexing date Somebody have an idea? My fields are not null and solr apparently thinks that they are On Tue, Aug 3, 2010 at 2:55 PM, Claudio Devecchi cdevec...@gmail.com wrote: My field is not empty, I have the date on the field and the error happens, the type to put on schema.xml is date? or have any other? On Tue, Aug 3, 2010 at 2:38 PM, Chris Hostetter hossman_luc...@fucit.org wrote: : On my xml file the date is on this format:* 2010-07-31T13:37:35:999Z* : Jul 28, 2010 11:40:44 PM org.apache.solr.common.SolrException log : *SEVERE: org.apache.solr.common.SolrException: Invalid Date String:''* According to that error message, you are attempting to index a date field with a value that is an empty string, Ie... field name=signupdate/field ...or perhaps... field name=signupdate / (xml makes no distinction) -Hoss -- Claudio Devecchi flickr.com/cdevecchi -- Claudio Devecchi flickr.com/cdevecchi -- Claudio Devecchi flickr.com/cdevecchi
analysis tool vs. reality
Hello, I have found the analysis tool in the admin page to be very useful in understanding my schema. I've made changes to my schema so that a particular case I'm looking at matches properly. I restarted solr, deleted the document from the index, and added it again. But still, when I do a query, the document does not get returned in the results. Does anyone have any tips for debugging this sort of issue? What is different between what I see in analysis tool and new documents added to the index? Thanks, Justin
Re: analysis tool vs. reality
The analysis tool is merely that, but during querying there is also a query parser involved. Adding debugQuery=true to your request will give you the parsed query in the response offering insight into what might be going on. Could be lots of things, like not querying the fields you think you are to a misunderstanding about some text not being analyzed (like wildcard clauses). Erik On Aug 3, 2010, at 4:43 PM, Justin Lolofie wrote: Hello, I have found the analysis tool in the admin page to be very useful in understanding my schema. I've made changes to my schema so that a particular case I'm looking at matches properly. I restarted solr, deleted the document from the index, and added it again. But still, when I do a query, the document does not get returned in the results. Does anyone have any tips for debugging this sort of issue? What is different between what I see in analysis tool and new documents added to the index? Thanks, Justin
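A debug request might look like this (host, port, and field name are placeholders, not taken from the thread):

```text
http://localhost:8983/solr/select?q=your_field:your_term&debugQuery=on
```

The response then includes a parsedquery section showing exactly which fields and analyzed terms the query parser produced.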
Re: Scoring on multi-valued fields
Well that does take care of some cases. How about if we still want a hit on a tag to contribute to the weight though? There would be 2 options. One is the one I described in the original post, which is to grab the highest score of a set of ids. The other would be to somehow control the scores of each id. So a document with 2 ids matching should be worth more than the document with only 1 id matching (This is how it works now) but a document with 7 ids matching shouldn't be worth more, or at least not a lot more, than a document that matches only 3 ids (this is not how it works). The reason this would be ideal for us is that we don't have any control over how many ids will be in the query and we don't want documents that have lots of ids to have an unnatural advantage over those with just a few. -- View this message in context: http://lucene.472066.n3.nabble.com/Scoring-on-multi-valued-fields-tp1017624p1020504.html Sent from the Solr - User mailing list archive at Nabble.com.
Using DateMath + range queries + Long doesn't work
Hi, I use Nutch and Solr to crawl a few thousand sites. I would like to limit my queries to recently changed documents. I use Nutch' index-more plugin which stores the Last-Modified HTTP response header in the index as a Long value. I would like to use a query like this to limit the results to the pages that were changed since yesterday: lastModified:[ms(NOW/DAY-1DAY) TO ms()] AND ... regular query ... This however doesn't work. If I use the following: lastModified:[128081160 TO 128089800] AND ... regular query ... or lastModified:[128081160 TO ms()] AND ... regular query ... I do get results. I use Solr 1.4.1. Any ideas what's wrong or why this doesn't work? Thanks, Jeroen
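A likely explanation: lastModified here is a plain long field holding epoch milliseconds, and (at least in Solr 1.4) function queries such as ms(NOW/DAY-1DAY) are not evaluated inside a classic [a TO b] range over such a field, so the clause never matches. One workaround, sketched below under that assumption, is to compute the millisecond bounds client-side and send literal numbers, as in the queries that did work:

```python
from datetime import datetime, timedelta, timezone

def last_modified_clause(days_back: int = 1) -> str:
    """Build a range clause over an epoch-millisecond long field,
    computing the equivalent of NOW/DAY-<days_back>DAY on the client."""
    now = datetime.now(timezone.utc)
    # Midnight UTC today, then step back the requested number of days
    day_start = now.replace(hour=0, minute=0, second=0, microsecond=0)
    lower = int((day_start - timedelta(days=days_back)).timestamp() * 1000)
    upper = int(now.timestamp() * 1000)
    return "lastModified:[%d TO %d]" % (lower, upper)

print(last_modified_clause())
```

The returned clause can then be ANDed onto the regular query string exactly as in the examples above that used literal numbers.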
Duplicate a core
Is it possible to duplicate a core? I want to have one core contain only documents within a certain date range (ex: 3 days old), and one core with all documents that have ever been in the first core. The small core is then replicated to other servers which do real-time processing on it, but the archive core exists for longer term searching. I understand I could just connect to both cores from my indexer, but I would like to not have to send duplicate documents across the network to save bandwidth. Is this possible? Thanks.
Re: Error indexing date
Can we see the rest of your document and your schema? Lots of people index dates, so my first guess is that some innocent change and/or typo is causing your problems, but there's no way to check without you posting the complete information. Best Erick On Tue, Aug 3, 2010 at 4:31 PM, Claudio Devecchi cdevec...@gmail.comwrote: Hi guys... I already changed from : to . and nothing happens, the problem is the solr ignores the content of the date, but if it is an incorrecty format, solr shows this tks On Tue, Aug 3, 2010 at 5:05 PM, Walter Underwood wun...@wunderwood.org wrote: That is because it is an illegal ISO 8601 datetime. The seconds portion should be 35.999, not 35:999. wunder On Aug 3, 2010, at 12:55 PM, Michael Griffiths wrote: I'd guess the DIH is not extracting the date correctly. Either way, Solr is not retrieving the date. -Original Message- From: Claudio Devecchi [mailto:cdevec...@gmail.com] Sent: Tuesday, August 03, 2010 3:45 PM To: solr-user@lucene.apache.org Subject: Re: Error indexing date Somebody have an idea? My fields are not null and solr apparently thinks that they are On Tue, Aug 3, 2010 at 2:55 PM, Claudio Devecchi cdevec...@gmail.com wrote: My field is not empty, I have the date on the field and the error happens, the type to put on schema.xml is date? or have any other? On Tue, Aug 3, 2010 at 2:38 PM, Chris Hostetter hossman_luc...@fucit.orgwrote: : On my xml file the date is on this format:* 2010-07-31T13:37:35:999Z* : Jul 28, 2010 11:40:44 PM org.apache.solr.common.SolrException log : *SEVERE: org.apache.solr.common.SolrException: Invalid Date String:''* According to that error message, you are attempting to index a date field with a value that is an empty string, Ie... field name=signupdate/field ...or perhaps... field name=signupdate / (xml makes no distinction) -Hoss -- Claudio Devecchi flickr.com/cdevecchi -- Claudio Devecchi flickr.com/cdevecchi -- Claudio Devecchi flickr.com/cdevecchi
Re: Error indexing date
: The files are attached. From the files you sent... field name=lastsignindate/field ...as i said before... : According to that error message, you are attempting to index a date : field with a value that is an empty string, : : Ie... : : field name=signupdate/field : : ...or perhaps... : : field name=signupdate / : : (xml makes no distinction) -Hoss
Re: Error indexing date
yep... do you know how I can make some fields optional in my schema? Because in some cases they will be null and sometimes not.. but many thanks, it now indexed ok with no errors tks On Tue, Aug 3, 2010 at 8:05 PM, Chris Hostetter hossman_luc...@fucit.org wrote: : The files are attached. From the files you sent... field name=lastsignindate/field ...as i said before... : According to that error message, you are attempting to index a date : field with a value that is an empty string, : : Ie... : : field name=signupdate/field : : ...or perhaps... : : field name=signupdate / : : (xml makes no distinction) -Hoss -- Claudio Devecchi flickr.com/cdevecchi
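For what it's worth, required is false by default, so simply dropping required=true makes the field optional. Even then, an empty element still fails to parse as a date, so documents without a value should omit the field entirely. A sketch (field name from the thread):

```xml
<!-- Optional date field: required defaults to false.
     Omit the element from documents that have no value; sending
     <field name="lastsignindate"/> still raises Invalid Date String:''. -->
<field name="lastsignindate" type="date" indexed="true" stored="true" required="false"/>
```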
analysis tool vs. reality
Hi Erik, thank you for replying. So, turning on debugQuery shows information about how the query is processed; is there a way to see how things are stored internally in the index? My query is ABC12. There is a document whose title field is ABC12. However, I can only get it to match if I search for ABC or 12. This was also true in the analysis tool up until recently. However, I changed schema.xml and turned on catenate-all in WordDelimiterFilterFactory for the title fieldtype. Now, in the analysis tool ABC12 matches ABC12. However, when doing an actual query, it does not match. Thank you for any help, Justin -- Forwarded message -- From: Erik Hatcher erik.hatc...@gmail.com To: solr-user@lucene.apache.org Date: Tue, 3 Aug 2010 16:50:06 -0400 Subject: Re: analysis tool vs. reality The analysis tool is merely that, but during querying there is also a query parser involved. Adding debugQuery=true to your request will give you the parsed query in the response offering insight into what might be going on. Could be lots of things, like not querying the fields you think you are to a misunderstanding about some text not being analyzed (like wildcard clauses). Erik On Aug 3, 2010, at 4:43 PM, Justin Lolofie wrote: Hello, I have found the analysis tool in the admin page to be very useful in understanding my schema. I've made changes to my schema so that a particular case I'm looking at matches properly. I restarted solr, deleted the document from the index, and added it again. But still, when I do a query, the document does not get returned in the results. Does anyone have any tips for debugging this sort of issue? What is different between what I see in analysis tool and new documents added to the index? Thanks, Justin
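A sketch of the kind of WordDelimiterFilterFactory configuration being described (attribute values are illustrative, not Justin's actual schema). One common cause of this analysis-tool/real-query mismatch is that the filter settings differ between the index and query analyzers, or that the rest of the index was not rebuilt after the schema change:

```xml
<!-- Sketch: catenate-all on the index side so ABC12 survives as one token
     alongside the split parts ABC and 12. Check that the query-side
     analyzer uses compatible settings, and reindex after any change. -->
<filter class="solr.WordDelimiterFilterFactory"
        generateWordParts="1" generateNumberParts="1"
        catenateWords="0" catenateNumbers="0" catenateAll="1"
        splitOnCaseChange="1"/>
```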
MultiCore SWAP and Replication
I'm using Solr Java replication with multiple master cores (at_bat, on_deck) and a single slave core (at_bat). The at_bat cores of the master and slave are used for processing search requests, and the on_deck core is used for complete index rebuilds. Once a rebuild is complete, the at_bat core is SWAPped with the on_deck core. The single slave core is configured to poll the master at_bat core. When the swap occurs, the slave detects the version change, and then the entire set of index files is replicated to a new index.MMDDhhmmss directory. Does anyone have advice on this approach: findings, issues encountered, possibly a way to work around the ever-growing copies of core data directories without having to use custom cleanup scripts? -Kelly
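The SWAP step described here is a single CoreAdmin request. A rough sketch of building it (host, port, and the default CoreAdmin path are assumptions; the core names are the ones from this thread):

```python
from urllib.parse import urlencode

def core_admin_swap_url(base_url, core, other):
    # Build the CoreAdmin SWAP request that exchanges the two cores'
    # registrations, so the rebuilt on_deck index becomes the live at_bat core.
    params = {"action": "SWAP", "core": core, "other": other}
    return f"{base_url}/admin/cores?{urlencode(params)}"

url = core_admin_swap_url("http://localhost:8983/solr", "at_bat", "on_deck")
```

Issuing a GET on that URL against the master is what triggers the version change the slave then detects.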
Sharing index files between multiple JVMs and replication
Is there a way to share index files amongst my multiple Solr web-apps, by configuring only one of the JVMs as an indexer, and the remaining as read-only searchers? I'd like to configure it in such a way that, on startup of the read-only searchers, missing cores/indexes are not created and updates are not handled. If I can get around the files being locked by the read-only instances, I should be able to scale wider in a given environment, as well as have fewer replicated copies of my master index (Solr 1.4 Java Replication). Then once the commit is issued to the slave, I can fire off a RELOAD script for each of my read-only cores. -Kelly
Re: analysis tool vs. reality
This is the 'index' part of the analysis.jsp page. You can ask how the text is indexed as well as how it is turned into a query. On Tue, Aug 3, 2010 at 4:35 PM, Justin Lolofie jta...@gmail.com wrote: Hi Erik, thank you for replying. So, turning on debugQuery shows information about how the query is processed - is there a way to see how things are stored internally in the index? My query is ABC12. There is a document whose title field is ABC12. However, I can only get it to match if I search for ABC or 12. This was also true in the analysis tool up until recently. However, I changed schema.xml and turned on catenate-all in WordDelimiterFilterFactory for the title fieldtype. Now, in the analysis tool ABC12 matches ABC12. However, when doing an actual query, it does not match. Thank you for any help, Justin -- Forwarded message -- From: Erik Hatcher erik.hatc...@gmail.com To: solr-user@lucene.apache.org Date: Tue, 3 Aug 2010 16:50:06 -0400 Subject: Re: analysis tool vs. reality The analysis tool is merely that, but during querying there is also a query parser involved. Adding debugQuery=true to your request will give you the parsed query in the response, offering insight into what might be going on. Could be lots of things, from not querying the fields you think you are, to a misunderstanding about some text not being analyzed (like wildcard clauses). Erik On Aug 3, 2010, at 4:43 PM, Justin Lolofie wrote: Hello, I have found the analysis tool in the admin page to be very useful in understanding my schema. I've made changes to my schema so that a particular case I'm looking at matches properly. I restarted solr, deleted the document from the index, and added it again. But still, when I do a query, the document does not get returned in the results. Does anyone have any tips for debugging this sort of issue? What is different between what I see in analysis tool and new documents added to the index? Thanks, Justin -- Lance Norskog goks...@gmail.com
Re: Sharing index files between multiple JVMs and replication
Are these files on a common file server? If you want to share them that way, it actually does work just to give them all the same index directory, as long as only one of them changes it. On Tue, Aug 3, 2010 at 4:38 PM, Kelly Taylor wired...@yahoo.com wrote: Is there a way to share index files amongst my multiple Solr web-apps, by configuring only one of the JVMs as an indexer, and the remaining, as read-only searchers? I'd like to configure in such a way that on startup of the read-only searchers, missing cores/indexes are not created, and updates are not handled. If I can get around the files being locked by the read-only instances, I should be able to scale wider in a given environment, as well as have less replicated copies of my master index (Solr 1.4 Java Replication). Then once the commit is issued to the slave, I can fire off a RELOAD script for each of my read-only cores. -Kelly -- Lance Norskog goks...@gmail.com
Best solution to avoiding multiple query requests
Hi all, I've got a situation where the key result from an initial search request (let's say for dog) is the list of values from a faceted field, sorted by hit count. For the top 10 of these faceted field values, I need to get the top hit for the target request (dog) restricted to that value for the faceted field. Currently this is 11 total requests, of which the 10 requests following the initial query can be made in parallel. But that's still a lot of requests. So my questions are: 1. Is there any magic query to handle this with Solr as-is? 2. if not, is the best solution to create my own request handler? 3. And in that case, any input/tips on developing this type of custom request handler? Thanks, -- Ken Ken Krugler +1 530-210-6378 http://bixolabs.com e l a s t i c w e b m i n i n g
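For reference, the 1 + 10 request pattern described above can be sketched as plain query strings. The field names here are placeholders, and this only builds the parameters (no custom handler, just stock faceting plus an fq-restricted follow-up per facet value):

```python
from urllib.parse import urlencode

def facet_query(q, facet_field, limit=10):
    # Initial request: only the top facet values are needed, so rows=0
    # skips fetching any documents.
    return urlencode({
        "q": q,
        "rows": 0,
        "facet": "true",
        "facet.field": facet_field,
        "facet.limit": limit,
        "facet.mincount": 1,
        "facet.sort": "count",
    })

def top_hit_query(q, facet_field, value):
    # Follow-up request: top hit for q restricted to one facet value via fq,
    # which is cached independently of q.
    return urlencode({"q": q, "rows": 1, "fq": f'{facet_field}:"{value}"'})
```

The ten follow-up queries share the same q, so they can be issued in parallel, as the poster notes; the fq filters hit the filter cache on repeat requests.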
Re: Queries with multiple wildcards failing in branch3x
OK thanks Paul. I just committed another (hopefully, last!) fix, so if you get a chance can you try that (just svn up)? Thanks, and, sorry, Mike On Tue, Aug 3, 2010 at 12:40 PM, Paul Dlug paul.d...@gmail.com wrote: Just reporting back, no issues on the latest branch3x build with your revert of the optimization. --Paul On Tue, Aug 3, 2010 at 9:22 AM, Paul Dlug paul.d...@gmail.com wrote: Sure, I'm reindexing now, I'll let you know how it goes. --Paul On Tue, Aug 3, 2010 at 9:05 AM, Michael McCandless luc...@mikemccandless.com wrote: Ugh... I think there may still be a bug lurking. Karl is also still having problems, much further into his indexing process. I'm hunting it now!! For the time being, I just disabled (committed to trunk 3x) the optimization that's causing the bug. Can you update to 3x head (or trunk head), remove your current index, and try again? Mike On Tue, Aug 3, 2010 at 8:52 AM, Paul Dlug paul.d...@gmail.com wrote: Thanks, I updated to the latest version with the fix but I'm now getting another error when optimizing the index (or when searching certain fields). It mentions unknown compression method but I'm not using compressed fields at all. 
SEVERE: java.io.IOException: background merge hit exception: _a:C248670/19645 _l:C206701/14563 _m:C12186/100 _n:C11356 _o:C9945 _p:C9000 _q:C5704 _r:C2214 _s:C2000 _t:C1264 into _u [optimize] [mergeDocStores]
    at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2392)
    at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:2320)
    at org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:403)
    at org.apache.solr.update.processor.RunUpdateProcessor.processCommit(RunUpdateProcessorFactory.java:85)
    at org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:169)
    at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:69)
    at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1322)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:341)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:244)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298)
    at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:857)
    at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:588)
    at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
    at java.lang.Thread.run(Thread.java:619)
Caused by: org.apache.lucene.index.CorruptIndexException: field data are in wrong format: java.util.zip.DataFormatException: unknown compression method
    at org.apache.lucene.index.FieldsReader.uncompress(FieldsReader.java:585)
    at org.apache.lucene.index.FieldsReader.addField(FieldsReader.java:357)
    at org.apache.lucene.index.FieldsReader.doc(FieldsReader.java:239)
    at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:894)
    at org.apache.lucene.index.IndexReader.document(IndexReader.java:684)
    at org.apache.lucene.index.SegmentMerger.copyFieldsWithDeletions(SegmentMerger.java:410)
    at org.apache.lucene.index.SegmentMerger.mergeFields(SegmentMerger.java:338)
    at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:159)
    at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4053)
    at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3647)
    at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:339)
    at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:407)
Caused by: java.util.zip.DataFormatException: unknown compression method
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:238)
    at java.util.zip.Inflater.inflate(Inflater.java:256)
    at
Re: wildcard and proximity searches
Frederico Azeiteiro wrote: But it is unusual to use both leading and trailing * operator. Why are you doing this? Yes I know, but I have a few queries that need this. I'll try the ReversedWildcardFilterFactory. ReversedWildcardFilter will help a leading wildcard, but will not help a query with BOTH leading and trailing wildcards; it'll still be slow. Solr/lucene isn't good at that; I didn't even know Solr would do it at all, in fact. If you really needed to do that, the way to play to solr/lucene's way of doing things would be to have a field where you actually index each _character_ as a separate token. Then a leading-and-trailing wildcard search is basically reduced to a phrase search, but where the words are actually characters. But then you're going to get an index where pretty much every token belongs to every document, which Solr isn't that great at either, but you can apply commongram stuff on top to help that out a lot too. Not quite sure what the end result will be; I've never tried it. I'd only use that weird chars-as-tokens field for queries that actually required leading and trailing wildcards. Figuring out how to set up your analyzers, and what (if anything) you're going to have to do client-app-side to transform the user's query into something that'll end up searching like a phrase search where each 'word' is a character, is left as an exercise for the reader. :) Jonathan
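To make the chars-as-tokens idea above concrete, here is a hypothetical sketch of both halves of the transform; the index-time analyzer would have to tokenize the special field the same way the query side does:

```python
def chars_as_tokens(text):
    # Index-time view of the special field: every character becomes
    # its own token (lowercased, mirroring a typical analyzer chain).
    return list(text.lower())

def wildcard_to_phrase(term):
    # Client-side rewrite of a double-wildcard term like *mon* into a
    # phrase query over single-character tokens, e.g. "m o n".
    inner = term.strip("*").lower()
    return '"' + " ".join(inner) + '"'
```

With this scheme, matching `*mon*` against a document containing "monitor" is just a phrase match of the tokens m, o, n in order.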
Re: Sharing index files between multiple JVMs and replication
Yes, they are on a common file server, and I've been sharing the same index directory between the Solr JVMs. But I seem to be hitting a wall when attempting to use just one instance for changing the index. With Solr replication disabled, I stream updates to the one instance, and this process hangs whenever there are additional Solr JVMs started up with the same configuration in solrconfig.xml - So I then tried, to no avail, using a different configuration, solrconfig-readonly.xml, where the updateHandler was commented out, all /update* requestHandlers removed, mainIndex locktype of none, etc. And with Solr replication enabled, the Slave seems to hang, or at least report unusually long time estimates for the current running replication process to complete. -Kelly - Original Message From: Lance Norskog goks...@gmail.com To: solr-user@lucene.apache.org Sent: Tue, August 3, 2010 4:56:58 PM Subject: Re: Sharing index files between multiple JVMs and replication Are these files on a common file server? If you want to share them that way, it actually does work just to give them all the same index directory, as long as only one of them changes it. On Tue, Aug 3, 2010 at 4:38 PM, Kelly Taylor wired...@yahoo.com wrote: Is there a way to share index files amongst my multiple Solr web-apps, by configuring only one of the JVMs as an indexer, and the remaining as read-only searchers? I'd like to configure it in such a way that, on startup of the read-only searchers, missing cores/indexes are not created and updates are not handled. If I can get around the files being locked by the read-only instances, I should be able to scale wider in a given environment, as well as have fewer replicated copies of my master index (Solr 1.4 Java Replication). Then once the commit is issued to the slave, I can fire off a RELOAD script for each of my read-only cores. -Kelly -- Lance Norskog goks...@gmail.com
Re: min/max, StatsComponent, performance
Chris Hostetter wrote: Honestly: if you have a really small cardinality for these numeric values (ie: small enough to return every value on every request) perhaps you should use faceting to find the min/max values (with facet.mincount=1) instead of stats? Thanks for the tips and info. I can't figure out any way to use faceting to find min/max values. If I do a facet.sort=index and facet.limit=1, then the facet value returned would be the min value... but how could I get the max value? There is no facet.sort=rindex or what have you. Ah, you say small enough to return every value on every request. Nope, it's not THAT small. I've got about 3 million documents, and 2-10k unique integers in a field, and I want to find the min/max. I guess, if I both index and store the field (which I guess I have to do anyway), I can find min and max via two separate queries. Sort by my_field asc, sort by my_field desc, with rows=1 both times, get out the stored field, and that's my min/max. That might be what I resort to. But it's a shame: StatsComponent can give me the info included in the query I'm already making, as opposed to requiring two additional queries on top of that -- which you'd think would be _slower_, but doesn't in fact seem to be. I don't think so .. i believe Ryan considered this when he first added StatsComponent, but he decided it wasn't really worth the trouble -- all of the stats are computed in a single pass, and the majority of the time is spent getting the value of every doc in the set -- adding each value to a running total (for the sum and ultimately computing the median) is a really cheap operation compared to the actual iteration over the set. Yeah, it's really kind of a mystery to me why StatsComponent is being so slow.
StatsComponent is slower than faceting on the field, and is even slower than the total time of: 1) First making the initial query, filling all caches, 2) Then making two additional queries with the same q/fq, but with different sorts, to get min and max from the result set in #1. From what you say, there's no good reason for StatsComponent to be slower than these alternatives, but it is, by an order of magnitude (1-2 seconds vs 10-15 seconds). I guess I'd have to get into Java profiling/debugging to figure it out; maybe there's a weird bug or mis-design somewhere I'm tripping. Jonathan
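The two-extra-queries fallback described above boils down to a pair of rows=1 requests; a sketch (field names are placeholders):

```python
def minmax_queries(q, field):
    # Two rows=1 requests with opposite sorts; the single stored value
    # returned by each one is the min (asc) and max (desc) of the field
    # over the result set for q.
    base = {"q": q, "rows": 1, "fl": field}
    return (
        {**base, "sort": f"{field} asc"},
        {**base, "sort": f"{field} desc"},
    )

min_params, max_params = minmax_queries("*:*", "my_field")
```

Since both requests share the same q/fq as the initial query, the sorted field cache is the only extra cost after the first run.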
Re: StatsComponent and sint?
Thanks Hoss, the problem was transient, I believe that my index had become corrupted (changed the schema but hadn't fully deleted all documents that had been using the previous version of the schema), my fault.
Re: Duplicate a core
What I'm doing now is just adding the documents to the other core each night and deleting old documents from the other core when I'm finished. Is there a better way? On Tue, Aug 3, 2010 at 4:38 PM, Max Lynch ihas...@gmail.com wrote: Is it possible to duplicate a core? I want to have one core contain only documents within a certain date range (ex: 3 days old), and one core with all documents that have ever been in the first core. The small core is then replicated to other servers which do real-time processing on it, but the archive core exists for longer term searching. I understand I could just connect to both cores from my indexer, but I would like to not have to send duplicate documents across the network to save bandwidth. Is this possible? Thanks.
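If the nightly cleanup on the small core is a delete-by-query, it can be expressed with Solr date math in a single request body; a sketch, assuming documents carry a stored date field (the field name here is made up):

```python
def delete_old_docs(date_field, days=3):
    # Delete-by-query body removing documents older than `days` days,
    # using Solr date math ([* TO NOW-3DAYS]) in a range query.
    return f"<delete><query>{date_field}:[* TO NOW-{days}DAYS]</query></delete>"
```

POSTing that body to the small core's /update handler (followed by a commit) prunes it while the archive core keeps everything.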
Re: analysis tool vs. reality
Did you reindex after changing the schema? On Aug 3, 2010, at 7:35 PM, Justin Lolofie wrote: Hi Erik, thank you for replying. So, turning on debugQuery shows information about how the query is processed - is there a way to see how things are stored internally in the index? My query is ABC12. There is a document whose title field is ABC12. However, I can only get it to match if I search for ABC or 12. This was also true in the analysis tool up until recently. However, I changed schema.xml and turned on catenate-all in WordDelimiterFilterFactory for the title fieldtype. Now, in the analysis tool ABC12 matches ABC12. However, when doing an actual query, it does not match. Thank you for any help, Justin -- Forwarded message -- From: Erik Hatcher erik.hatc...@gmail.com To: solr-user@lucene.apache.org Date: Tue, 3 Aug 2010 16:50:06 -0400 Subject: Re: analysis tool vs. reality The analysis tool is merely that, but during querying there is also a query parser involved. Adding debugQuery=true to your request will give you the parsed query in the response, offering insight into what might be going on. Could be lots of things, from not querying the fields you think you are, to a misunderstanding about some text not being analyzed (like wildcard clauses). Erik On Aug 3, 2010, at 4:43 PM, Justin Lolofie wrote: Hello, I have found the analysis tool in the admin page to be very useful in understanding my schema. I've made changes to my schema so that a particular case I'm looking at matches properly. I restarted solr, deleted the document from the index, and added it again. But still, when I do a query, the document does not get returned in the results. Does anyone have any tips for debugging this sort of issue? What is different between what I see in analysis tool and new documents added to the index? Thanks, Justin
Re: Multiple solr servers Vs Katta
Hi, I thought Katta should come into the picture when you have around a TB of data to search. Thanks a lot. Can you please point me to, or elaborate more on, how to manage a growing index? Any standard strategies?
Re: Error indexing date
: do you know how I can make some fields in my schema not required? : Because in some cases they will be null and sometimes not. You can say required=false on fields in your schema.xml -- but that won't change your situation (required=false is actually the default). The problem here isn't that Solr thinks those fields are necessary; the problem is that you are sending a string value (the empty string, aka "") and asking solr to index it in a date field, but it can't be parsed as a date. if you have documents for which the value is 'null' then the correct course of action is to not include that field in your document at all when sending it to Solr. -Hoss
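In client code that builds the add document, Hoss's advice amounts to dropping null/empty values before they ever reach Solr; a minimal sketch:

```python
def to_solr_doc(record):
    # Omit fields whose value is None or "" so a date field never
    # receives an unparseable empty string; absent is fine, "" is not.
    return {k: v for k, v in record.items() if v not in (None, "")}
```

A record like {"id": "1", "signupdate": ""} then indexes as just {"id": "1"}, and the optional date field is simply absent for that document.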