Re: What is omitNorms
thanks for the link, i got lot information from this document. Can u please tell me how can i verify omitNorms effect in my document indexing or searching. - Romi -- View this message in context: http://lucene.472066.n3.nabble.com/What-is-omitNorms-tp2987547p2987649.html Sent from the Solr - User mailing list archive at Nabble.com.
how can i index data in different documents
Hi, in my database i have two types of entity customer and product. I want to index customer related information in one document and product related information in other document. is it possible via solr , if so how can i achieve this. Thanks Regards Romi. - Romi -- View this message in context: http://lucene.472066.n3.nabble.com/how-can-i-index-data-in-different-documents-tp2987696p2987696.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Single document scanning
First of all subscribe to the mailing list. - Romi -- View this message in context: http://lucene.472066.n3.nabble.com/Single-document-scanning-tp2987614p2987705.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: What is omitNorms
When you set omitNorms=true on a field, Solr will not store norms for that field. AFAIK, if you do not store norms, your index will be smaller and will take less memory, so you can safely omit them for short fields.
Norms carry index-time boosts and field length normalization. They are computed at indexing time so that a match in a shorter field gets a higher score. Whether to turn norms on or off depends on your index size and your application.
I hope this helps. Thanks.
On Thu, May 26, 2011 at 11:48 AM, Romi romijain3...@gmail.com wrote: thanks for the link, i got lot information from this document. Can u please tell me how can i verify omitNorms effect in my document indexing or searching. - Romi -- View this message in context: http://lucene.472066.n3.nabble.com/What-is-omitNorms-tp2987547p2987649.html Sent from the Solr - User mailing list archive at Nabble.com.
-- Chandan Tamrakar
RE: Single document scanning
it seems, try again for better results - Romi -- View this message in context: http://lucene.472066.n3.nabble.com/Single-document-scanning-tp2987614p2987788.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: What is omitNorms
"Norms are used to boosts and field length normalization during indexing time so that short document has higher score"
How is it that if I set omitNorms=false for a field, short documents get a higher score? I could not get this point, maybe because I could not find any running example for it. Would you please explain? - Romi -- View this message in context: http://lucene.472066.n3.nabble.com/What-is-omitNorms-tp2987547p2987799.html Sent from the Solr - User mailing list archive at Nabble.com.
UniqueKey field in schema.xml
suppose I have two tables in database, say product table and customer table.i want to make (productID,customerID) a uniqueKey for my indexing document. how can i achieve this. - Romi -- View this message in context: http://lucene.472066.n3.nabble.com/UniqueKey-field-in-schema-xml-tp2987807p2987807.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Spellcheck: Two dictionaries
?? -- View this message in context: http://lucene.472066.n3.nabble.com/Spellcheck-Two-dictionaries-tp2931458p2987915.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: problem in setting field attribute in schema.xml
ya...but when i set indexed=false for a particular field, and i search as *:* then it will search all documents thats true, but what i think is it should not contain the field which i set as indexed=true. for example in a document fields are id, author,title. and i for author field i set indexed=false, then author should not be indexed and when i perform search as *:* it should show all documents as
<doc> <string name="id">id1</string> <string name="title">t1</string> <string name="author">a1</string> </doc>
Well, since I am only a beginner myself, I have to say what my experience is. Given that I have cleared my index, restarted, reindexed with the new schema settings and restarted again (which is probably overdone), and the schema I indexed with says indexed=false, stored=true for author: if I search for author:a1 I get 0 results, as I expect, and if I search for id:id1 it shows
<doc> <string name="id">id1</string> <string name="title">t1</string> <string name="author">a1</string> </doc>
as I expect. Is this what is happening for you? If it is, and you are confused as to why, I can't answer that on a technical level. I assume it is based on design decisions which, I would agree, don't seem sensible to me but which are very probably based on some underlying technical reason that I am not familiar with.
If you want to make sure that you only see id and title in your result, then either set stored=false for author (although why you would have a field that is neither stored nor indexed I don't know) or use the fl parameter on your request to list the fields you want returned. For example, fl=id,title in the query string should mean you just see
<string name="id">id1</string> <string name="title">t1</string>
and not
<string name="author">a1</string>
Best Regards, Bryan Rasmussen
Re: problem in setting field attribute in schema.xml
thanks a lot bryan: it might be again the repetition, but i just want to know WHY it is indexing the field when it is indexed=false, what if stored=true, it is clearly written in documentation that a field is search able only if it is indexed=true, which surely make sense. and my application is not saying to do so i am just experimenting with solr to learn it. want to clear my concepts about indexing. Thanks Romi - Romi -- View this message in context: http://lucene.472066.n3.nabble.com/problem-in-setting-field-attribute-in-schema-xml-tp2984126p2988066.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: UniqueKey field in schema.xml
Create a new unique field for this purpose, like, myUniqueField, then, just combine (product-id+cust-id) and post it to this new field. -- View this message in context: http://lucene.472066.n3.nabble.com/UniqueKey-field-in-schema-xml-tp2987807p2988098.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: What is omitNorms
omitNorms=true on a field has the following effects:
1. Length normalization will not work on that field. This means that, at search time, matching documents with a shorter field value will not be preferred (boosted) over matching documents with a longer one.
2. Index-time boosting will not be available on the field.
If you need neither of the above, you can set omitNorms=true for those fields. This has an added advantage: it will save you some (or a lot of) RAM, since with omitNorms=false on a total of N fields the index requires RAM of size: total docs in index * 1 byte * N
-- View this message in context: http://lucene.472066.n3.nabble.com/What-is-omitNorms-tp2987547p2988124.html Sent from the Solr - User mailing list archive at Nabble.com.
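As a rough illustration of the setting described above, a schema.xml fragment could look like the sketch below. The field names (title, sku) and the field types (text, string) are assumptions made for the example; only the omitNorms attribute itself comes from this thread.

```xml
<!-- schema.xml (sketch): field names and types are hypothetical -->
<fields>
  <!-- norms kept: length normalization and index-time boosts apply to this field -->
  <field name="title" type="text" indexed="true" stored="true" omitNorms="false"/>
  <!-- norms omitted: no length normalization, no index-time boost,
       and roughly one byte per document saved for this field -->
  <field name="sku" type="string" indexed="true" stored="true" omitNorms="true"/>
</fields>
```

One way to see the effect is to index the same documents with each setting and compare the scores of a short and a long matching field value using debugQuery=on. As noted elsewhere in this digest, changing omitNorms requires a full reindex.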
Re: FieldCache
This is because you may be having only 10 unique terms in your indexed Field. BTW, what do you mean by controlling the FieldCache? -- View this message in context: http://lucene.472066.n3.nabble.com/FieldCache-tp2987541p2988142.html Sent from the Solr - User mailing list archive at Nabble.com.
Problem with spellchecking, dont want multiple request to SOLR
Hello, First i will explain my situation. I have a 2 fields on my website: What and Where. When a user search i want spellcheck on both fields. Now i have 2 dictionaries, one for what and one for where. I want to search with one request and spellcheck both fields. Is it possible and how? -- View this message in context: http://lucene.472066.n3.nabble.com/Problem-with-spellchecking-dont-want-multiple-request-to-SOLR-tp2988167p2988167.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: UniqueKey field in schema.xml
What do you mean by combining the two fields customerID and ProductId? What I tried is:
1. Making both fields unique, but that does not serve my purpose.
2. Making a new field ID, copying both customerID and ProductId into ID using copyField, and making ID the uniqueKey. But I got an error saying: Document specifies multiple unique ids
- Romi -- View this message in context: http://lucene.472066.n3.nabble.com/UniqueKey-field-in-schema-xml-tp2987807p2988168.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: What is document tag in data-config.xml of Solr
The document tag represents the actual Solr document that will be posted by the DIH. The DIH uses this mapping to go from DB rows to index documents. You can have multiple entity tags, as you might be pulling data from more than one table, but you can only have one document tag in your db-data-config.xml (remember, the purpose of db-data-config.xml is to map DB structure to index structure). -- View this message in context: http://lucene.472066.n3.nabble.com/What-is-document-tag-in-data-config-xml-of-Solr-tp2978668p2988176.html Sent from the Solr - User mailing list archive at Nabble.com.
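The layout described above can be sketched as follows. This is only an illustration: the JDBC driver, connection URL, credentials, and the table, column and field names are all hypothetical.

```xml
<!-- db-data-config.xml (sketch): driver, url, tables and columns are hypothetical -->
<dataConfig>
  <dataSource type="JdbcDataSource" driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost/mydb" user="db_user" password="db_pass"/>
  <!-- exactly one document element -->
  <document>
    <!-- root entity: one Solr document per row returned by the query -->
    <entity name="product" query="select id, name from product">
      <field column="id" name="id"/>
      <field column="name" name="name"/>
      <!-- nested entity: pulls rows from a second table for each product row -->
      <entity name="category"
              query="select label from category where product_id='${product.id}'">
        <field column="label" name="category"/>
      </entity>
    </entity>
  </document>
</dataConfig>
```

Each field element maps a DB column (column) to a field declared in schema.xml (name).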
Re: problem in setting field attribute in schema.xml
From my experience if it is indexing content that you have told it not to index that is because you haven't cleared your old indexed content. If you index something using schema version 5 which says indexed = true and then you change it to indexed = false you have to delete your old indexed content and reindex using the new schema, with lots of stopping and restarting involved. So - delete index, restart with new schema, index content with new schema. Best Regards, Bryan Rasmussen On Thu, May 26, 2011 at 11:24 AM, Romi romijain3...@gmail.com wrote: thanks a lot bryan: it might be again the repetition, but i just want to know WHY it is indexing the field when it is indexed=false, what if stored=true, it is clearly written in documentation that a field is search able only if it is indexed=true, which surely make sense. and my application is not saying to do so i am just experimenting with solr to learn it. want to clear my concepts about indexing. Thanks Romi - Romi -- View this message in context: http://lucene.472066.n3.nabble.com/problem-in-setting-field-attribute-in-schema-xml-tp2984126p2988066.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Too many Boolean Clause and Filter Query
I'm sure you can fix this by increasing maxBooleanClauses value to some max. This shld apply to filter query as well -- View this message in context: http://lucene.472066.n3.nabble.com/Too-many-Boolean-Clause-and-Filter-Query-tp2974848p2988190.html Sent from the Solr - User mailing list archive at Nabble.com.
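For reference, the setting mentioned above lives in the query section of solrconfig.xml. A minimal sketch, assuming the stock example configuration where the value is 1024; the 4096 used here is an arbitrary example, not a recommendation:

```xml
<!-- solrconfig.xml (sketch): 4096 is an arbitrary example value -->
<query>
  <maxBooleanClauses>4096</maxBooleanClauses>
</query>
```

Raising the limit only moves the ceiling, so it does not by itself address a list of ids that keeps growing.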
Re: problem in setting field attribute in schema.xml
i deleted my index but what do u mean by restart with new schema?? - Romi -- View this message in context: http://lucene.472066.n3.nabble.com/problem-in-setting-field-attribute-in-schema-xml-tp2984126p2988197.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: problem in setting field attribute in schema.xml
Well I'm probably being overly cautious here but its been my experience that if I have a schema that says indexed = true on a field and I change it to indexed = false I have to delete my index to get rid of everything that was indexed with the old schema and I have to restart to be able to index with the new schema. I've had the situation a number of times where I have changed the indexing rule for a field and not followed these steps and been surprised when my index does not follow my expectations - and it seems like you are experiencing the same thing. Best Regards, Bryan Rasmussen
How to access the content of a CopyField with Solrj?
Hi all, I have a catch-all field defined as a CopyField in the Schema and use a POJO to create the documents, thus the POJO doesn't include the catch-all field. The SolrDocuments retrieved contains the fields set up in the POJO and doesn't include the CopyFields. Is-it possible to access the content of a CopyField with Solrj? Any advice on this issue would be appreciated. Best, -- Jean-Claude Dauphin
Re: UniqueKey field in schema.xml
You concatenate the two keys into a single string, with some sort of delimiter between the two keys. François On May 26, 2011, at 6:05 AM, Romi wrote: what do you mean by combine two fields customerID and ProductId. what i tried is 1. make both fields unique but it doesnot server my purpose 2. make a new field ID and copy both customerID , ProductId into ID using CopyField and now make ID as uniqueKey but i got a error saying: Document specifies multiple unique ids - Romi -- View this message in context: http://lucene.472066.n3.nabble.com/UniqueKey-field-in-schema-xml-tp2987807p2988168.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: problem in setting field attribute in schema.xml
i have done it, i deleted old indexes and created new indexes but still able to search it through *:*, and no result when i search it as field:value. really surprising result. :-O - Romi -- View this message in context: http://lucene.472066.n3.nabble.com/problem-in-setting-field-attribute-in-schema-xml-tp2984126p2988256.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: UniqueKey field in schema.xml
I am not getting how can i combine two keys in to a single string using some delimiter - Romi -- View this message in context: http://lucene.472066.n3.nabble.com/UniqueKey-field-in-schema-xml-tp2987807p2988284.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: UniqueKey field in schema.xml
Here is some code:
--
final String key1 = "1";
final String key2 = "2";
final String masterKey = key1 + ":" + key2;
--
You need to combine the keys *before* you send them to Solr. François
On May 26, 2011, at 7:02 AM, Romi wrote: I am not getting how can i combine two keys in to a single string using some delimiter - Romi -- View this message in context: http://lucene.472066.n3.nabble.com/UniqueKey-field-in-schema-xml-tp2987807p2988284.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: UniqueKey field in schema.xml
I might have phrased the question badly. This is the entry in my db-data-config.xml file:
<entity name="torder" query="select UID_PK,creationDate,email,confirmationCode from torder">
  <field column="creationDate" name="date" />
  <entity name="torderattribute" query="select UID_PK from torderattribute where orderUID='${torder.UID_PK}'">
    <field column="UID_PK" name="UID" />
  </entity>
</entity>
Now I want to combine UID_PK and UID into the uniqueKey of my indexed document. I want to know how I can achieve this through schema.xml. Thanks Romi - Romi -- View this message in context: http://lucene.472066.n3.nabble.com/UniqueKey-field-in-schema-xml-tp2987807p2988323.html Sent from the Solr - User mailing list archive at Nabble.com.
RE: Out of memory on sorting
For saving memory:
1. Allocate as much memory as you can to the JVM (especially if you are using a 64-bit OS).
2. Set omitNorms=true for your date and id fields (actually for all fields where index-time boosting and length normalization aren't required; this will require a full reindex).
3. Are you sorting on all documents available in the index? Try to limit the set using filter queries.
4. Avoid match-all-docs queries like q=*:* (if you are using one).
5. If you can, do away with sorting on the ID field and sort on a field with fewer unique terms.
Hope this helps -- View this message in context: http://lucene.472066.n3.nabble.com/Out-of-memory-on-sorting-tp2960578p2988336.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: how to integrate solr with spring framework
Just read through: http://www.springbyexample.org/examples/solr-client.html http://static.springsource.org/spring-roo/reference/html/base-solr.html -- View this message in context: http://lucene.472066.n3.nabble.com/how-to-integrate-solr-with-spring-framework-tp2955540p2988363.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: UniqueKey field in schema.xml
Romi, then you want to use the http://wiki.apache.org/solr/DataImportHandler#TemplateTransformer ? :) Regards Stefan
On 26.05.2011 13:17, Romi wrote: i might have misspelled the question. this is the entry in my db-data-config.xml file:
<entity name="torder" query="select UID_PK,creationDate,email,confirmationCode from torder">
  <field column="creationDate" name="date" />
  <entity name="torderattribute" query="select UID_PK from torderattribute where orderUID='${torder.UID_PK}'">
    <field column="UID_PK" name="UID" />
  </entity>
</entity>
now i want combine UID_PK and UID for the uniqueKey of my indexing documet. i want to know how can i achieve this through schema.xml Thanks Romi - Romi -- View this message in context: http://lucene.472066.n3.nabble.com/UniqueKey-field-in-schema-xml-tp2987807p2988323.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: problem in setting field attribute in schema.xml
I guess you are indexing with indexed=false, stored=true. If so, you are storing the value in the index, so whenever you search for *:* you can see the stored value.
For example, say you have the fields ID and Customer_Name, and you only want to index Customer_Name because that is the field users are going to search. You can then just store ID in the index without indexing it. When a customer name matches, you can also show the ID to users.
I do not know what the purpose is in your case. Stored fields are usually needed when you don't want to search on a value but do want to show it in the search results. I hope that's clear. You can also experiment with changing these values on a unique field. Thanks.
On Thu, May 26, 2011 at 4:37 PM, Romi romijain3...@gmail.com wrote: i have done it, i deleted old indexes and created new indexes but still able to search it through *:*, and no result when i search it as field:value. really surprising result. :-O - Romi -- View this message in context: http://lucene.472066.n3.nabble.com/problem-in-setting-field-attribute-in-schema-xml-tp2984126p2988256.html Sent from the Solr - User mailing list archive at Nabble.com.
-- Chandan Tamrakar
Re: problem in setting field attribute in schema.xml
On 26.05.2011 12:52, Romi wrote: i have done it, i deleted old indexes and created new indexes but still able to search it through *:*, and no result when i search it as field:value. really surprising result. :-O
I really don't understand your problem. This is not at all surprising but the expected behaviour: *:* just gives you every document in your index, no matter which parts of the document are stored or indexed. It just gives _everything_, whereas field:value does an actual search for an indexed value "value" in the field "field". So it is no surprise either that you didn't get a result here if you didn't index that field. -Michael
Re: Terms Component - solr-1.4.0
Hi All, Please help me in implementing TermsComponent in my current Solr solution. Regards, Solr User
On Tue, May 17, 2011 at 4:12 PM, Solr User solr...@gmail.com wrote: Hi All, I am using Solr 1.4.0 and dismax as request handler. I have the following in my solrconfig.xml in the dismax request handler tag:
<arr name="last-components"> <str>spellcheck</str> </arr>
The above tags help to find terms if there are spelling issues. I tried configuring the terms component and had no luck. May I know how to configure the terms component with dismax? Or do I need to call the terms component directly to get auto suggestions? Thank you so much in advance. Regards, Solr User
Re: problem in setting field attribute in schema.xml
did u mean when i set indexed=false and store=true, solr does not index the field's value but store its value as it is??? - Romi -- View this message in context: http://lucene.472066.n3.nabble.com/problem-in-setting-field-attribute-in-schema-xml-tp2984126p2988458.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Huge performance drop in distributed search w/ shards on the same server/container
Do you really require multi-shards? Single core/shard will do for even millions of documents and the search will be faster than searching on multi-shards. Consider multi-shard when you cannot scale-up on a single shard/machine(e.g, CPU,RAM etc. becomes major block). Also read through the SOLR distributed search wiki to check on all tuning up required at application server(Tomcat) end, like maxHTTP request settings. For a single request in a multi-shard setup internal HTTP requests are made through all queried shards, so, make sure you set this parameter higher. -- View this message in context: http://lucene.472066.n3.nabble.com/Huge-performance-drop-in-distributed-search-w-shards-on-the-same-server-container-tp2938421p2988464.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: UniqueKey field in schema.xml
I tried it as:
<entity name="torder" query="select UID_PK,creationDate,email,confirmationCode from torder">
  <field column="creationDate" name="date" />
  <entity name="torderattribute" transformer="TemplateTransformer" query="select UID_PK from torderattribute where orderUID='${torder.UID_PK}'">
    <field column="UID_PK" template="{torder.UID_PK},${torderattribute.UID_PK}" />
  </entity>
</entity>
But I suppose it is not correct, because here I am not mapping UID_PK of torderattribute to any field in schema.xml. Can I add it like this:
<field column="UID_PK" name="ID" template="{torder.UID_PK},${torderattribute.UID_PK}" />
where ID is a field in schema.xml and is the uniqueKey? - Romi -- View this message in context: http://lucene.472066.n3.nabble.com/UniqueKey-field-in-schema-xml-tp2987807p2988484.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: How does Solr's MoreLikeThis component internally work to get results?
This will help: http://cephas.net/blog/2008/03/30/how-morelikethis-works-in-lucene/ -- View this message in context: http://lucene.472066.n3.nabble.com/How-does-Solr-s-MoreLikeThis-component-internally-work-to-get-results-tp2938407p2988487.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: problem in setting field attribute in schema.xml
On Thu, May 26, 2011 at 2:10 PM, Romi romijain3...@gmail.com wrote: did u mean when i set indexed=false and store=true, solr does not index the field's value but store its value as it is??? Yes. So you can get back the value of all stored fields even if your search actually only finds results in indexed fields. It does seem somewhat counter-intuitive. Best Regards, Bryan Rasmussen
Re: problem in setting field attribute in schema.xml
Am 26.05.2011 14:10, schrieb Romi: did u mean when i set indexed=false and store=true, solr does not index the field's value but store its value as it is??? I don't know if you are asking me since you do not quote anything but yes of course this is exactly the purpose of indexed and stored. -Michael
RE: problem in setting field attribute in schema.xml
Hi Romi, as someone mentioned earlier already:
indexed - The field value can be matched when you search on that field (field:some-value-to-match)
stored - The field value can be retrieved from Solr in result sets (result docs can include that field and its value)
@ Indexing in general: I think you will have to restart Solr and/or re-index (maybe even delete / re-import) all your data after certain changes to your schema. I cannot formalize this any better, though, because I am a beginner myself.
Romi wrote: did u mean when i set indexed=false and store=true, solr does not index the field's value but store its value as it is???
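The two attributes can be sketched in schema.xml using the id, title and author fields from earlier in this thread. The field types (string, text) are assumptions; the thread does not say which types were used.

```xml
<!-- schema.xml (sketch): field types are assumed, field names are from the thread -->
<field name="id" type="string" indexed="true" stored="true"/>
<field name="title" type="text" indexed="true" stored="true"/>
<!-- not indexed: author:a1 matches nothing;
     stored: author is still returned for documents found by *:* or id:id1 -->
<field name="author" type="string" indexed="false" stored="true"/>
```

With this schema, fl=id,title on the request would leave author out of the response, as suggested earlier in the thread.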
Re: problem in setting field attribute in schema.xml
:), Thanks.. now i got the purpose of indexed and store. - Romi -- View this message in context: http://lucene.472066.n3.nabble.com/problem-in-setting-field-attribute-in-schema-xml-tp2984126p2988506.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: problem in setting field attribute in schema.xml
Thanks for making me understand the concept of indexing and storing field. now i got the point :) - Romi -- View this message in context: http://lucene.472066.n3.nabble.com/problem-in-setting-field-attribute-in-schema-xml-tp2984126p2988516.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: problem in setting field attribute in schema.xml
Yes, as I said earlier. Use it if you want to store the value of a field in the index as it is, without tokenizing, for example customer_id, which is a unique field that you don't want to tokenize. When you index a field, its values are tokenized according to the tokenizer you use, so that users can search on them.
On Thu, May 26, 2011 at 5:55 PM, Romi romijain3...@gmail.com wrote: did u mean when i set indexed=false and store=true, solr does not index the field's value but store its value as it is??? - Romi -- View this message in context: http://lucene.472066.n3.nabble.com/problem-in-setting-field-attribute-in-schema-xml-tp2984126p2988458.html Sent from the Solr - User mailing list archive at Nabble.com.
-- Chandan Tamrakar
Re: Too many Boolean Clause and Filter Query
We have increased maxBooleanClauses now, but since we have a number of instances on a single server, and the number of ids that get added to the filter will keep increasing with no known limit, I was wondering if there is any other scalable method that is not affected by maxBooleanClauses. Also, looking at the ManifoldCF documentation, I am not sure whether it is any different from indexing user permissions into Solr and filtering on them. Has anybody done this for permission-based document filtering? Regards Sujatha
On Thu, May 26, 2011 at 3:47 PM, pravesh suyalprav...@yahoo.com wrote: I'm sure you can fix this by increasing maxBooleanClauses value to some max. This shld apply to filter query as well -- View this message in context: http://lucene.472066.n3.nabble.com/Too-many-Boolean-Clause-and-Filter-Query-tp2974848p2988190.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: UniqueKey field in schema.xml
Romi, first, you're missing a $ sign for the first variable. Second, why not just <field column="ID" .. /> ? The field tag has no ID, in case you're using the TemplateTransformer. Regards Stefan
On 26.05.2011 14:16, Romi wrote: I tried it as :
<entity name="torder" query="select UID_PK,creationDate,email,confirmationCode from torder">
  <field column="creationDate" name="date" />
  <entity name="torderattribute" transformer="TemplateTransformer" query="select UID_PK from torderattribute where orderUID='${torder.UID_PK}'">
    <field column="UID_PK" template="{torder.UID_PK},${torderattribute.UID_PK}" />
  </entity>
</entity>
But i suppose it is not correct because here i am not mapping UID_PK of torderattribute to any of field in schema.xml. can i add like this:
<field column="UID_PK" name="ID" template="{torder.UID_PK},${torderattribute.UID_PK}" />
where ID is a field in schema.xml and it is UniqueKey. - Romi -- View this message in context: http://lucene.472066.n3.nabble.com/UniqueKey-field-in-schema-xml-tp2987807p2988484.html Sent from the Solr - User mailing list archive at Nabble.com.
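Putting Stefan's two corrections together, the entity definition might look like the sketch below. It assumes, as in Romi's message, that ID is the field declared as uniqueKey in schema.xml; everything else is taken from the configuration quoted above.

```xml
<!-- db-data-config.xml (sketch): ID is assumed to be the uniqueKey field in schema.xml -->
<entity name="torder"
        query="select UID_PK,creationDate,email,confirmationCode from torder">
  <field column="creationDate" name="date"/>
  <entity name="torderattribute" transformer="TemplateTransformer"
          query="select UID_PK from torderattribute where orderUID='${torder.UID_PK}'">
    <!-- both variables now carry the $ sign; the template result is written to the ID column -->
    <field column="ID" template="${torder.UID_PK},${torderattribute.UID_PK}"/>
  </entity>
</entity>
```

Here column names the key under which the TemplateTransformer stores its result, so it does not have to be an existing database column.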
how can i index data in different documents
Hi, i was not getting reply for this post, so here i am reposting this, please reply. In my database i have two types of entity customer and product. I want to index customer related information in one document and product related information in other document. is it possible via solr , if so how can i achieve this. Thanks Regards Romi. - Romi -- View this message in context: http://lucene.472066.n3.nabble.com/how-can-i-index-data-in-different-documents-tp2988621p2988621.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: UniqueKey field in schema.xml
Stefan wrote: first, you're missing a $ Sign for the first Variable. second, why not just <field column="ID" .. /> ? The field-Tag has no ID, in case you're using the TemplateTransformer.
I got my solution, thanks for it. But after reading this reply of yours, please make one thing clear: UID_PK is a column in my database table torderattribute, so I thought it was mandatory to map it in schema.xml. That is why I used <field column="UID_PK" name="ID" ../>. How, then, can I just use <field column="ID" .. />? - Romi -- View this message in context: http://lucene.472066.n3.nabble.com/UniqueKey-field-in-schema-xml-tp2987807p2988635.html Sent from the Solr - User mailing list archive at Nabble.com.
why use QueryElevationComponent
what is QueryElevationComponent, why it is used. in my schema.xml if i do not declare a uniqueKey then it shows the error org.apache.solr.common.SolrException: QueryElevationComponent requires the schema to have a uniqueKeyField why so ?? - Romi -- View this message in context: http://lucene.472066.n3.nabble.com/why-use-QueryElevationComponent-tp2988675p2988675.html Sent from the Solr - User mailing list archive at Nabble.com.
Issue while extracting content from MS Excel 2007 file using TikaEntityProcessor
Hi All, I am using Solr 3.1 for one of our search-based applications. We are using DIH to index our data and TikaEntityProcessor to index attachments. Currently we are running into an issue while extracting content from one of our MS Excel 2007 files using TikaEntityProcessor. The issue is that TikaEntityProcessor hangs without throwing any exception, which in turn causes the indexing to hang on the server. Has anyone faced a similar issue in the past with TikaEntityProcessor? Also, does someone know of a way to just skip such a file and move on to the next document to be indexed? -- Thanks and Regards Rahul A. Warawdekar
Re: What is omitNorms
What would be the default value for omitNorms? --- Default value is false Is general advise to ignore this and set the value explicitly? --- Depends on your requirement. Do this on field-per-field basis. Set to false on fields where you want the norms, or, set to true on fields where you want to omit the norms -- View this message in context: http://lucene.472066.n3.nabble.com/What-is-omitNorms-tp2987547p2988714.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: What is omitNorms
Hi pravesh, Thanks for the quick reply. --Dmitry On Thu, May 26, 2011 at 4:27 PM, pravesh suyalprav...@yahoo.com wrote: What would be the default value for omitNorms? --- Default value is false Is general advise to ignore this and set the value explicitly? --- Depends on your requirement. Do this on field-per-field basis. Set to false on fields where you want the norms, or, set to true on fields where you want to omit the norms -- View this message in context: http://lucene.472066.n3.nabble.com/What-is-omitNorms-tp2987547p2988714.html Sent from the Solr - User mailing list archive at Nabble.com. -- Regards, Dmitry Kan
RE: SOLR Install
Thanks, Yuhan. I will look into both methods. Which is better or which method is recommended? How do I search through a database? Raj
-----Original Message----- From: Yuhan Zhang [mailto:yzh...@onescreen.com] Sent: Monday, May 23, 2011 7:16 PM To: solr-user@lucene.apache.org Subject: Re: SOLR Install
Hi Raj, To index files using java, use solrj: http://www.google.com/search?q=solrj&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:official&client=firefox-a To index files by a post request, follow this tutorial: http://www.xml.com/pub/a/2006/08/09/solr-indexing-xml-with-lucene-andrest.html Yuhan
On Mon, May 23, 2011 at 7:10 AM, Roger Shah rs...@caci.com wrote: Hi, I am a new user and I have installed SOLR 3.1.0 and running Tomcat 7.0. I was able to run the example which shows the SOLR Admin screen. Also posted an XML file by this command from dos prompt: java -jar post.jar solr.xml. How can I get SOLR to search web sites and also search through other types of files, databases, etc? Instead of running the example that comes with SOLR, How do I create my own? Also can you point me to a SOLR Guide or documentation? I did not see any detailed documentation. Please show me where can I post messages on the SOLR web site. Thanks, Raj
Re: how can i index data in different documents
Hi Romi,
A simple way to do so is to define in your schema.xml the union of all the columns you need, plus a type field to distinguish your entities. E.g.:

In your DB:
  table1:
    - col1 : varchar
    - col2 : int
    - col3 : float
  table2:
    - col1 : int
    - col2 : varchar
    - col3 : int
    - col4 : varchar

In Solr's schema:
  <field name="table1_col1" type="text" />
  <field name="table1_col2" type="int" />
  <field name="table1_col3" type="float" />
  <field name="table2_col1" type="int" />
  <field name="table2_col2" type="text" />
  <field name="table2_col3" type="int" />
  <field name="table2_col4" type="string" />
  <field name="type" type="string" required="true" multiValued="false" />

Ensure that when you add your documents, their type value is effectively set to either table1 or table2. That's one possibility amongst others.
-- Tanguy

On 05/26/11 14:57, Romi wrote:
Hi, i was not getting reply for this post, so here i am reposting this, please reply. In my database i have two types of entity customer and product. I want to index customer related information in one document and product related information in other document. is it possible via solr , if so how can i achieve this. Thanks Regards Romi. - Romi -- View this message in context: http://lucene.472066.n3.nabble.com/how-can-i-index-data-in-different-documents-tp2988621p2988621.html Sent from the Solr - User mailing list archive at Nabble.com.
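As a rough sketch of the approach described above: the type is set as an ordinary field value on each document that is added, not as something in schema.xml. The snippet below builds a Solr XML update message with one document per database row. The helper names and the sample row values are invented for illustration; only the field naming scheme (table1_col1, type, and so on) comes from the message above.

```python
import xml.etree.ElementTree as ET


def to_solr_doc(entity_type, row):
    """Build one <doc>; every column is prefixed with the entity type, plus a 'type' field."""
    doc = ET.Element("doc")
    type_field = ET.SubElement(doc, "field", name="type")
    type_field.text = entity_type
    for column, value in row.items():
        field = ET.SubElement(doc, "field", name=f"{entity_type}_{column}")
        field.text = str(value)
    return doc


def build_add_message(rows_by_type):
    """Wrap all documents of all entity types in a single <add> update message."""
    add = ET.Element("add")
    for entity_type, rows in rows_by_type.items():
        for row in rows:
            add.append(to_solr_doc(entity_type, row))
    return ET.tostring(add, encoding="unicode")


# Hypothetical sample rows, one per table
message = build_add_message({
    "table1": [{"col1": "Alice", "col2": 42, "col3": 1.5}],
    "table2": [{"col1": 7, "col2": "Widget", "col3": 3, "col4": "blue"}],
})
print(message)
```

At query time the two kinds of documents can then be kept apart with a filter on the type field, for example fq=type:table1.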
Issue in Solr Indexing
Hi All, When I index a record into Solr, the indexing reports success, and the commit I issue afterwards also reports success. But when I then search for that particular record in Solr, it is not returned. I am using Solr 1.4.1. Can anyone please suggest why that particular record is not being indexed? I am not getting any error in the Catalina log file either. Thanks in advance. -- DEEPAK AGRAWAL +91-9379433455 GOOD LUCK.
Re: UniqueKey field in schema.xml
Stefen, as you can see:

  <entity name="torder" query="select UID_PK,creationDate,email,confirmationCode from torder">
    <field column="creationDate" name="date" />
    <entity name="torderattribute" transformer="TemplateTransformer" query="select UID_PK from torderattribute where orderUID='${torder.UID_PK}'">
      <field column="UID_PK" template="${torder.UID_PK},${torderattribute.UID_PK}" />
    </entity>
  </entity>

orderUID is a foreign key in the table torderattribute which maps to UID_PK (the primary key) of torder. When I run the query

  select UID_PK from torderattribute where orderUID='${torder.UID_PK}'

it should fetch multiple rows, because there are multiple rows corresponding to one orderUID, but it does not. To make it clearer, consider this:

  torder:
    UID_PK (primary key)
    120
    121
    122

  torderattribute:
    UID_PK (primary key)   orderUID
    86                     120
    87                     120
    89                     121
    90                     121
    91                     121

In my search result I get only (120,86) and (121,89). I am missing 3 values from torderattribute. Why is that? Please explain.
Thanks, Romi. - Romi -- View this message in context: http://lucene.472066.n3.nabble.com/UniqueKey-field-in-schema-xml-tp2987807p2988774.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: how can i index data in different documents
> Ensure that when you add your documents, their type value is effectively set to either table1 or table2.

Did you mean that I should set <document name="d1" type="table1"> in schema.xml? As far as I know there can only be one document tag, so what about table2?
- Romi -- View this message in context: http://lucene.472066.n3.nabble.com/how-can-i-index-data-in-different-documents-tp2988621p2988789.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Issue in Solr Indexing
On Thu, May 26, 2011 at 7:06 PM, deepak agrawal dk.a...@gmail.com wrote:
> Hi All, When i am Indexing the Record into the Solr it is successfully indexing and after that i am committing that commit is also showing successfully. but when i am going to search that particular record into the solr that time i am not getting that record from Solr. I am using Solr1.4.1 version.

Please provide us with more details, as there is not much to go on here:
* How are you indexing? How are you telling that the indexing was successful?
* How is the field defined in the Solr schema?
* What is the commit response?

> any one can please suggest why that particular record is not indexing.I am not getting any error from Catalina log file also.
[...]

What does this mean? You say above that the indexing is successful, but seem to be saying here that it was not successful after all.
Regards, Gora
Re: Termscomponent sort question
Hi Dmitry Kan, thanks for your answer. That is an idea, but I think it will not perform well. If there are 1000 terms, I must reorder all 1000 by their length, and I think that will take too long for autocomplete. Don't you think? -- View this message in context: http://lucene.472066.n3.nabble.com/Termscomponent-sort-question-tp2980683p2988872.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: analyzer type - does it default to index or query?
(11/05/26 13:23), Andy wrote:
> Hi, When specifying an analyzer for a fieldType, I can say type=index or type=query. What if I don't specify the type for an analyzer? Does it default to index or query or both?

Both.
koji -- http://www.rondhuit.com/en/
RE: Spellcheck: Two dictionaries
Are you trying to do something like this:

  defType=dismax&qf=what where&q=(spellchek me with both diktionaries fur what and where)

?? If so, then I believe your only option is to create a third dictionary that combines what and where into one big uber-dictionary. Create a new field and copyField the values into it. Base your uber-dictionary on this new field.
James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311

-----Original Message-----
From: roySolr [mailto:royrutten1...@gmail.com]
Sent: Thursday, May 26, 2011 3:24 AM
To: solr-user@lucene.apache.org
Subject: Re: Spellcheck: Two dictionaries

?? -- View this message in context: http://lucene.472066.n3.nabble.com/Spellcheck-Two-dictionaries-tp2931458p2987915.html Sent from the Solr - User mailing list archive at Nabble.com.
RE: FieldCache
10 unique terms on 1.5M documents each with 50+ fields? I don't think so ;) What I mean is controlling its size like the other caches. There are currently no options in solrconfig.xml to control this cache. Is Solr/Lucene managing this all by itself? It could be that my understanding of the FieldCache is wrong. I thought this was the main cache for Lucene. Is that right? Thanks for your feedback -Original Message- From: pravesh [mailto:suyalprav...@yahoo.com] Sent: May-26-11 2:58 AM To: solr-user@lucene.apache.org Subject: Re: FieldCache This is because you may be having only 10 unique terms in your indexed Field. BTW, what do you mean by controlling the FieldCache? -- View this message in context: http://lucene.472066.n3.nabble.com/FieldCache-tp2987541p2988142.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: solr 3.1 without slf4j-jdk14-1.5.5.jar
If I'm not wrong, solrj uses slf4j for logging. slf4j-api.jar provides the API, but is not capable of doing the actual logging by itself. To be able to log, it needs an actual implementation, usually a binding to some other logging library. slf4j-jdk14 is the binding that uses the logging API in the JDK (since v 1.4) to do the actual logging.

Solrj needs slf4j-api and one binding. You have to choose one and can exclude the jars for the other bindings. The options are:
- slf4j-log4j12: binding to the log4j library, version 1.2. Delegates logging to log4j.
- slf4j-jdk14: binding to the JDK logging library (in JDK v 1.4 or greater). Delegates logging to the JDK.
- slf4j-nop: a dummy implementation that silently discards all log messages.
- slf4j-simple: itself an implementation, which logs messages to System.err (only messages of level INFO or higher).
- slf4j-jcl: binding for the Jakarta Commons Logging library. Delegates logging to JCL.

The documentation also lists a dependency on jcl-over-slf4j. This is quite the opposite of slf4j-jcl: while the latter implements the slf4j API and delegates logging to JCL, the former implements the JCL API and delegates logging to slf4j. I don't really think that solrj is using this (not sure). I believe that solrj uses slf4j. Needing jcl-over-slf4j would mean that some code in solrj uses the JCL API rather than the slf4j API and needs an implementation for it as well. If you take a look at the maven repositories, there is no such dependency for solrj, so I guess it's not really needed.
I hope I managed to explain it clearly.
Cheers, Juan

El 26/05/2011, a las 16:36, antonio escribió:
> Reading the wiki, for use solrj i must use this lib: From /lib •slf4j-jdk14-1.5.5.jar But there isn't no one directory call lib, and no one jar called slf4j-jdk14-1.5.5.jar . Is it necessary? When i can get it? -- View this message in context: http://lucene.472066.n3.nabble.com/solr-3-1-without-slf4j-jdk14-1-5-5-jar-tp2988950p2988950.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: FieldCache
fieldCache stores one entry for each field that is used for sorting, or for field faceting when you use the fieldCache (fc) method. Before solr 1.4 the method for field faceting was the enum method, which executes a filter query for each unique value of the field and stores it in the filterCache. From solr 1.4 on, the default method is fc, except for boolean fields, which use the enum method by default.
So, you should have one entry in fieldCache for each field that you use either for sorting or for field faceting with the fc facet method. Does it match?
I don't know of a way to configure the size of the fieldCache. I don't know how much memory each entry consumes, either. Sorry not to be of further help.
Cheers

El 26/05/2011, a las 16:50, Jean-Sebastien Vachon escribió:
> 10 unique terms on 1.5M documents each with 50+ fields? I don't think so ;) What I mean is controlling its size like the other caches. There are currently no options in solrconfig.xml to control this cache. Is Solr/Lucene managing this all by itself? It could be that my understanding of the FieldCache is wrong. I thought this was the main cache for Lucene. Is that right? Thanks for your feedback
>
> -----Original Message-----
> From: pravesh [mailto:suyalprav...@yahoo.com]
> Sent: May-26-11 2:58 AM
> To: solr-user@lucene.apache.org
> Subject: Re: FieldCache
>
> This is because you may be having only 10 unique terms in your indexed Field. BTW, what do you mean by controlling the FieldCache? -- View this message in context: http://lucene.472066.n3.nabble.com/FieldCache-tp2987541p2988142.html Sent from the Solr - User mailing list archive at Nabble.com.
Question about SolrResponseBase.toString()
Hi, I'm working with Solrj, and I like to use the SolrResponseBase.toString() method, as it seems to return JSON. However, the JSON returned is not valid, as it is missing quotes.

If I search directly against Solr using

  http://localhost:8080/apache-solr-3.1-SNAPSHOT/select/?q=*%3A*&version=2.2&start=0&rows=10&indent=on&wt=json

I get this:

  { "responseHeader":{ "status":0, "QTime":2, "params":{ "indent":"on", "start":"0", "q":"*:*", "wt":"json", "rows":"10", "version":"2.2"}},

When I search through the Solrj API and do a SolrResponseBase.toString(), it looks like this:

  {responseHeader={status=0,QTime=3,params={facet=true,sort=mc_type asc,mc_id desc,facet.mincount=1,q=mc_facet_1:red,facet.limit=8,facet.field=[mc_facet_1, mc_facet_2, mc_facet_3, mc_facet_4, mc_facet_5, mc_facet_6, mc_facet_7],wt=xml,version=2.2}},

None of the fields are being quoted. Does anyone know how to get this to return valid JSON?
Thanks, -Michiel
-- Michiel Verkaik Software Architect rivetlogic http://www.rivetlogic.com/ Voice 646-217-0890 Skype michiel.verkaik_rivetlogic GTalk mverkaik@rivetlogic.com
RE: FieldCache
Since FieldCache is an expert level API in lucene, there is no direct control provided by SOLR/Lucene to control its size. -- View this message in context: http://lucene.472066.n3.nabble.com/FieldCache-tp2987541p2989443.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Problem with spellchecking, dont want multiple request to SOLR
Yep, it's possible. Setup two spellcheckers, one named spellwhat and one named spellwhere and enable both on your searchRequestHandler. -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com On 26. mai 2011, at 12.04, roySolr wrote: Hello, First i will explain my situation. I have a 2 fields on my website: What and Where. When a user search i want spellcheck on both fields. Now i have 2 dictionaries, one for what and one for where. I want to search with one request and spellcheck both fields. Is it possible and how? -- View this message in context: http://lucene.472066.n3.nabble.com/Problem-with-spellchecking-dont-want-multiple-request-to-SOLR-tp2988167p2988167.html Sent from the Solr - User mailing list archive at Nabble.com.
SOLR-2463 Null context in DIH
https://issues.apache.org/jira/browse/SOLR-2463 Haven't received any input/comments on this issue. Has anyone else witnessed this behavior? Thanks.
Re: Termscomponent sort question
Hi antonio, can you explain a bit more, how exactly have you implemented the autocomplete, is it with the terms component only? Does autocomplete operate on letter or word level? What does user type in for which the server returns both Rome and Near Rome? -- Dmitry On Thu, May 26, 2011 at 5:11 PM, antonio antonio...@email.it wrote: Hi Dmitry Kan, thanks for your anwser. This is an idea, but i think that will be not so performing. Because if the terms are 1000, i must reorder 1000 terms by own length, and i think the time will be high for make autocomplete. Don't you think? -- View this message in context: http://lucene.472066.n3.nabble.com/Termscomponent-sort-question-tp2980683p2988872.html Sent from the Solr - User mailing list archive at Nabble.com. -- Regards, Dmitry Kan
Nutch Crawl error
I ran the command

  bin/nutch crawl urls -dir crawl -depth 3 >& crawl.log

When I viewed crawl.log I found some errors such as: Can't retrieve Tika parser for mime-type application/x-shockwave-flash, and some other similar messages for other types such as application/xml, etc. Do I need to download Tika for these errors to go away? Where can I download Tika so that it can work with Nutch? If there are instructions to install Tika to work with Nutch please send them to me. Thanks, Roger
copyField of dates unworking?
Are there some sort of rules about what sort of fields can be copyFielded into other fields? My schema has (among other things):

  <field name="date" type="tdate" indexed="true" stored="true" required="true" />
  <field name="user" type="string" indexed="true" stored="true" required="true" />
  <field name="text" type="textgen" indexed="true" stored="true" required="false" multiValued="true" />
  ...
  <copyField source="user" dest="text"/>
  <copyfield source="date" dest="text"/>

The user field gets copied into text just fine, but the date field does not.

In case they're handy, I've attached:
- schema.xml - the complete schema
- solr-usr-question.xml - a sample doc
- solr-usr-answer.xml - the result in the searchbase

-==- Jack Repenning Technologist Codesion Business Unit CollabNet, Inc. 8000 Marina Boulevard, Suite 600 Brisbane, California 94005 office: +1 650.228.2562 twitter: http://twitter.com/jrep
Re: Issue while extracting content from MS Excel 2007 file using TikaEntityProcessor
Can you rule out Tika or Solr by trying to parse the file with a stand-alone Tika?

> Hi All, I am using Solr 3.1 for one of our search based applications. We are using DIH to index our data and TikaEntityProcessor to index attachments. Currently we are running into an issue while extracting content from one of our MS Excel 2007 files, using TikaEntityProcessor. The issue is that the TikaEntityProcessor hangs without throwing any exception, which in turn causes the indexing to hang on the server. Has anyone faced a similar kind of issue in the past with TikaEntityProcessor? Also, does someone know of a way to just skip this type of behaviour for that file and move to the next document to be indexed?
Re: Returning documents using multi-valued field
Hi, maybe I wasn't so clear in my previous post. Here's another go (I'd like a reply :) ):

Currently I'm issuing this query on Solr:

  http://localhost:9001/solrfacetsearch/master_Shop/select/?q=%28keyword_text_mv%3A%28alice+AND+trudy%29%29+AND+%28catalogId%3A%22Default%22%29+AND+%28catalogVersion%3AOnline%29&start=0&rows=2147483647&facet=true&facet.field=category_string_mv&sort=preferred_boolean+desc%2Cgeo_distance+asc&facet.mincount=1&facet.limit=50&facet.sort=index&radius=111.84681460272012&long=5.2864094&qt=geo&lat=52.2119418&debugQuery=on

where, as you can see, I'm searching for keywords Alice AND Trudy. This query returns a document which contains:

  <arr name="keyword_text_mv">
    <str>alice jill</str>
    <str>trudy alex</str>
  </arr>

The problem is I'd like the document to be returned only if it contains the string alice trudy in one of its values, in other words, if it contains:

  <arr name="keyword_text_mv">
    <str>alice trudy</str>
    <str>jill alex</str>
  </arr>

How could I achieve this? I'm supporting code written by someone else and I'm quite new to Solr. Thanks in advance :)
Kurt

On Wed, May 25, 2011 at 11:44 AM, Kurt Sultana kurtanat...@gmail.com wrote:
> Hi all, I'm quite new to Solr and I'm supporting an existing Solr search engine which was written by someone else. I've been reading on Solr for the last couple of weeks so I'd consider myself beyond the basics. A particular field, let's say name, is multi-valued. For example, a document has a field name with values Alice, Trudy. We want that the document is returned when Alice or Trudy is input and not when Alice Trudy is entered. Currently the document is returned even with Alice Trudy. How could this be done? Thanks a lot! Kurt
RE: Returning documents using multi-valued field
This is a limitation of Lucene/Solr in that there is no way to tell it not to match across multi-valued field occurrences. A workaround is to convert your query to a phrase and add a slop factor less than your positionIncrementGap. ex:

  q="alice trudy"~99

This example assumes that your positionIncrementGap is set to 100 (the default I think) or greater. This tells it that rather than search for a strict phrase, the words in the phrase can be up to 99 positions apart. Because the multi-valued fields are implemented under-the-covers by simply increasing the position of the next occurrence by the positionIncrementGap value, this will effectively prevent Lucene/Solr from matching across occurrences.

The downside to this workaround is that wildcards are not permitted in phrase searches. So if you need wildcard support also, then you're out of luck.
James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311

-----Original Message-----
From: Kurt Sultana [mailto:kurtanat...@gmail.com]
Sent: Thursday, May 26, 2011 3:05 PM
To: solr-user@lucene.apache.org
Subject: Re: Returning documents using multi-valued field

Hi, maybe I wasn't so clear in my previous post. Here's another go (I'd like a reply :) ): Currently I'm issuing this query on Solr:

  http://localhost:9001/solrfacetsearch/master_Shop/select/?q=%28keyword_text_mv%3A%28alice+AND+trudy%29%29+AND+%28catalogId%3A%22Default%22%29+AND+%28catalogVersion%3AOnline%29&start=0&rows=2147483647&facet=true&facet.field=category_string_mv&sort=preferred_boolean+desc%2Cgeo_distance+asc&facet.mincount=1&facet.limit=50&facet.sort=index&radius=111.84681460272012&long=5.2864094&qt=geo&lat=52.2119418&debugQuery=on

where as you can see I'm searching for keywords Alice AND Trudy.
This query returns a document which contains:

  <arr name="keyword_text_mv">
    <str>alice jill</str>
    <str>trudy alex</str>
  </arr>

The problem is I'd like the document to be returned only if it contains a string alice trudy in one of its values, in other words, if it contains :

  <arr name="keyword_text_mv">
    <str>alice trudy</str>
    <str>jill alex</str>
  </arr>

How could I achieve this? I'm supporting the code written by someone else and I'm quite new to Solr. Thanks in advance :) Kurt

On Wed, May 25, 2011 at 11:44 AM, Kurt Sultana kurtanat...@gmail.com wrote:
Hi all, I'm quite new to Solr and I'm supporting an existing Solr search engine which was written by someone else. I've been reading on Solr for the last couple of weeks so I'd consider myself beyond the basics. A particular field, let's say name, is multi-valued. For example, a document has a field name with values Alice, Trudy. We want that the document is returned when Alice or Trudy is input and not when Alice Trudy is entered. Currently the document is even with Alice Trudy. How could this be done? Thanks a lot! Kurt
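The position arithmetic behind the slop workaround in the reply above can be sketched with a small model. This is a simplification and not Lucene's actual code: it assumes whitespace tokenization, a position increment of 1 per token, the gap added once between consecutive values, and a two-term phrase whose required slop is the distance between the second term's actual position and its expected position right after the first term. All function names here are invented for the illustration.

```python
def token_positions(values, position_increment_gap=100):
    """Assign positions to whitespace tokens across the values of a multi-valued field."""
    positions = {}
    position = -1
    for index, value in enumerate(values):
        if index > 0:
            # The gap pushes the next value's tokens far away from the previous value
            position += position_increment_gap
        for token in value.lower().split():
            position += 1
            positions.setdefault(token, []).append(position)
    return positions


def required_slop(positions, first, second):
    """Smallest slop for the phrase 'first second' to match, or None if a term is absent."""
    if first not in positions or second not in positions:
        return None
    return min(abs((p2 - 1) - p1)
               for p1 in positions[first]
               for p2 in positions[second])


def phrase_matches(values, first, second, slop, gap=100):
    """True if the two-term phrase matches within the given slop under this model."""
    needed = required_slop(token_positions(values, gap), first, second)
    return needed is not None and needed <= slop


across_values = ["alice jill", "trudy alex"]
same_value = ["alice trudy", "jill alex"]
print(phrase_matches(across_values, "alice", "trudy", slop=99))
print(phrase_matches(same_value, "alice", "trudy", slop=99))
```

In this model the first document would need a slop of 101 to match, so a slop of 99 rejects it, while the second document matches as an exact phrase. The workaround depends on the gap configured for the field type being larger than the slop used in the query.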
Re: copyField of dates unworking?
It seems like reserved keywords can't be used as field names. Did you try changing the name of your date field?

On Thu, May 26, 2011 at 9:54 PM, Jack Repenning jrepenn...@collab.net wrote:
> Are there some sort of rules about what sort of fields can be copyFielded into other fields? My schema has (among other things):
>
>   <field name="date" type="tdate" indexed="true" stored="true" required="true" />
>   <field name="user" type="string" indexed="true" stored="true" required="true" />
>   <field name="text" type="textgen" indexed="true" stored="true" required="false" multiValued="true" />
>   ...
>   <copyField source="user" dest="text"/>
>   <copyfield source="date" dest="text"/>
>
> The user field gets copied into text just fine, but the date field does not. In case they're handy, I've attached: - schema.xml - the complete schema - solr-usr-question.xml - a sample doc - solr-usr-answer.xml - the result in the searchbase
> -==- Jack Repenning Technologist Codesion Business Unit CollabNet, Inc. 8000 Marina Boulevard, Suite 600 Brisbane, California 94005 office: +1 650.228.2562 twitter: http://twitter.com/jrep

-- Anass
Question about the purpose of reindexing
I've been trying to find a concise explanation of this, and so far seem to have missed it (Google, etc.). What is the purpose of, or need for, reindexing a Solr index? How do you determine what provides the best performance? What detrimental effects occur if you operate off of delta indexes?
Aaron Chmelik Web Designer Programmer email: aaron.chme...@gmail.com website: http://webdesign.aaronchmelik.com phone: 651.757.5979
commit configuration
Hi, I'm using DIH and want to perform a commit every N processed documents. How can I do this? Thanks in advance -- Anass
Re: Question about the purpose of reindexing
Define reindexing. Every new document is indexed and existing documents are deleted and indexed as if it is a new document. Completely reindexing from scratch is only required if breaking changes are made to the schema or if you upgrade to a new version that uses another format and isn't able to convert. To my knowledge and experience, it is perfectly sane to never do a complete reindex and still accept incremental updates. Just do an optimize once in a while and you're good to go. I've been trying to find a concise explanation of this, and seem to so far have missed it. (Google, etc). What is the purpose/need to reindex a solr index? How do you determine what provides the best performance? What detrimental affects occur if you operate off of delta indexes? Aaron Chmelik Web Designer Programmer email: aaron.chme...@gmail.com website: http://webdesign.aaronchmelik.com phone: 651.757.5979
Re: commit configuration
http://svn.apache.org/repos/asf/lucene/dev/trunk/solr/example/solr/conf/solrconfig.xml
Look for autoCommit and maxDocs.

> Hi, I'm using DIH and want to perform commits each N processed document, how can I do this? thanks in advance
Re: copyField of dates unworking?
On May 26, 2011, at 1:55 PM, anass talby wrote:
> it seems like reserved key words can't be used as field names did you try to changes your date field name?

Interesting thought, but it didn't seem to help. I changed the schema so it has both a date and an eventDate field (so as not to invalidate my current data), and changed the copyField statement to from=eventDate. Then I added an eventDate field to the test document mentioned earlier, with a one-second difference so I could be sure which was which. I added that doc, but the text field still doesn't have either date field. Any other thoughts why I can't copyField a date into a textgen?

  {
    "responseHeader":{
      "status":0,
      "QTime":5,
      "params":{
        "indent":"on",
        "start":"0",
        "q":"text:\"example for list question\"",
        "version":"2.2",
        "rows":"10"}},
    "response":{"numFound":1,"start":0,"docs":[
      {
        "id":"jackrepenningdev-p1-svn-solr-user-question-1",
        "item":"r10",
        "itemNumber":10,
        "user":"jackrepenning",
        "date":"2011-05-26T20:34:19Z",
        "eventDate":"2011-05-26T20:34:20Z",
        "log":"example for list question",
        "organization":"jackrepenningdev",
        "project":"p1",
        "system":"versioncontrol",
        "subsystem":"svn",
        "class":"operation",
        "className":"commit",
        "text":[
          "r10",
          "jackrepenning",
          "M /trunk/cvsdude/solr/conf/schema.xml",
          "example for list question"],
        "paths":["/trunk/cvsdude/solr/conf/schema.xml"],
        "changes":["M /trunk/cvsdude/solr/conf/schema.xml"]}]
    }}

-==- Jack Repenning Technologist Codesion Business Unit CollabNet, Inc. 8000 Marina Boulevard, Suite 600 Brisbane, California 94005 office: +1 650.228.2562 twitter: http://twitter.com/jrep
Re: Question about the purpose of reindexing
One more question: what does optimization do? To be a little more precise, what happens to the index that requires optimization (what is the problem, and how does optimization solve it)?
Aaron Chmelik Web Designer Programmer email: aaron.chme...@gmail.com website: http://webdesign.aaronchmelik.com phone: 651.757.5979

On Thu, May 26, 2011 at 4:27 PM, Aaron Chmelik aaron.chme...@gmail.com wrote:
> Thanks! Coming from Sphinx, with separate delta index files. Thanks for the info!
>
> On Thu, May 26, 2011 at 4:21 PM, Markus Jelsma markus.jel...@openindex.io wrote:
>> Define reindexing. Every new document is indexed and existing documents are deleted and indexed as if it is a new document. Completely reindexing from scratch is only required if breaking changes are made to the schema or if you upgrade to a new version that uses another format and isn't able to convert. To my knowledge and experience, it is perfectly sane to never do a complete reindex and still accept incremental updates. Just do an optimize once in a while and you're good to go.
>>
>>> I've been trying to find a concise explanation of this, and seem to so far have missed it. (Google, etc). What is the purpose/need to reindex a solr index? How do you determine what provides the best performance? What detrimental affects occur if you operate off of delta indexes?
Re: Question about the purpose of reindexing
Optimizing an index forces segments to merge. Usually, segments are merged automatically based on your mergeFactor setting. During a merge, documents flagged for deletion are actually purged and the number of segments is reduced, which improves search performance. There are some good pages on mergeFactor around. Haven't got a link at hand.

> One more question - what does optimization do? Maybe to be a little more precise - what happens to the index that requires optimizaion (what is the problem and how does optimization solve it).
>
> On Thu, May 26, 2011 at 4:27 PM, Aaron Chmelik aaron.chme...@gmail.com wrote:
>> Thanks! coming from Sphinx, with seperate delta index files. Thanks for the info!
>>
>> On Thu, May 26, 2011 at 4:21 PM, Markus Jelsma markus.jel...@openindex.io wrote:
>>> Define reindexing. Every new document is indexed and existing documents are deleted and indexed as if it is a new document. Completely reindexing from scratch is only required if breaking changes are made to the schema or if you upgrade to a new version that uses another format and isn't able to convert. To my knowledge and experience, it is perfectly sane to never do a complete reindex and still accept incremental updates. Just do an optimize once in a while and you're good to go.
>>>
>>>> I've been trying to find a concise explanation of this, and seem to so far have missed it. (Google, etc). What is the purpose/need to reindex a solr index? How do you determine what provides the best performance? What detrimental affects occur if you operate off of delta indexes?
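As a toy picture of the explanation above (not Lucene's merge policy or file format), an index can be thought of as a list of segments in which an update leaves the old copy of a document behind, flagged as deleted. Optimizing then amounts to merging everything into one segment and dropping the flagged copies. The data layout and function name below are invented for the illustration.

```python
def optimize(segments):
    """Merge all segments into one, dropping documents flagged as deleted."""
    merged = [doc for segment in segments for doc in segment if not doc["deleted"]]
    return [merged]


# Document 2 was updated: the old copy is flagged as deleted, the new copy sits in a later segment
segments = [
    [{"id": 1, "deleted": False}, {"id": 2, "deleted": True}],
    [{"id": 2, "deleted": False}],
    [{"id": 3, "deleted": True}, {"id": 4, "deleted": False}],
]

optimized = optimize(segments)
print(len(segments), "segments ->", len(optimized), "segment")
print([doc["id"] for doc in optimized[0]])
```

A search over the optimized index has one segment to consult instead of three, and the space held by the deleted copies has been reclaimed.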
Re: Question about the purpose of reindexing
Thanks. I think I can take it from there!
Aaron Chmelik Web Designer Programmer email: aaron.chme...@gmail.com website: http://webdesign.aaronchmelik.com phone: 651.757.5979

On Thu, May 26, 2011 at 4:51 PM, Markus Jelsma markus.jel...@openindex.io wrote:
> Optimizing an index forces segments to merge. Usually, segments are merged automatically based on your mergeFactor setting. During a merge documents flagged for deletion are really purged and the number of segments is reduces which improves search performance. There are some good pages on mergeFactor around. Haven't got a link at hand.
>
>> One more question - what does optimization do? Maybe to be a little more precise - what happens to the index that requires optimizaion (what is the problem and how does optimization solve it).
>>
>> On Thu, May 26, 2011 at 4:27 PM, Aaron Chmelik aaron.chme...@gmail.com wrote:
>>> Thanks! coming from Sphinx, with seperate delta index files. Thanks for the info!
>>>
>>> On Thu, May 26, 2011 at 4:21 PM, Markus Jelsma markus.jel...@openindex.io wrote:
>>>> Define reindexing. Every new document is indexed and existing documents are deleted and indexed as if it is a new document. Completely reindexing from scratch is only required if breaking changes are made to the schema or if you upgrade to a new version that uses another format and isn't able to convert. To my knowledge and experience, it is perfectly sane to never do a complete reindex and still accept incremental updates. Just do an optimize once in a while and you're good to go.
>>>>
>>>>> I've been trying to find a concise explanation of this, and seem to so far have missed it. (Google, etc). What is the purpose/need to reindex a solr index? How do you determine what provides the best performance? What detrimental affects occur if you operate off of delta indexes?
Re: Issue while extracting content from MS Excel 2007 file using TikaEntityProcessor
Hi Markus, It is Tika. I tried using tika standalone.

On 5/26/11, Markus Jelsma markus.jel...@openindex.io wrote:
> Can you rule out Tika or Solr by trying to parse the file with a stand-alone Tika?
>
>> Hi All, I am using Solr 3.1 for one of our search based applications. We are using DIH to index our data and TikaEntityProcessor to index attachments. Currently we are running into an issue while extracting content from one of our MS Excel 2007 files, using TikaEntityProcessor. The issue is the TikaEntityProcessor is hung without throwing any exception which in turn causes the indexing to be hung on the server. Has anyone faced a similar kind of issue in the past with TikaEntityProcessor ? Also, does someone know of a way to just skip this type of behaviour for that file and move to the next document to be indexed ?

-- Thanks and Regards Rahul A. Warawdekar
highlighting in multiValued field
Hi All, I am having a problem with search highlighting for multiValued fields and am wondering if someone can point me in the right direction. I have in my schema a multiValued field as such: <field name="description" type="text" stored="true" indexed="true" multiValued="true"/> When I search for term Tel, it returns me the correct doc: <doc> ... <arr name="description"> <str>Tel to talent 1</str> <str>Tel to talent 2</str> </arr> ... </doc> When I enable highlighting, it returns me the following highlight with only one vector returned: ... <lst name="highlighting"> <lst name="1"> <arr name="description"> <str><em>Tel</em> to talent 1</str> </arr> </lst> </lst> What I'm expecting is actually both vectors to be returned as such: <lst name="highlighting"> <lst name="1"> <arr name="description"> <str><em>Tel</em> to talent 1</str> <str><em>Tel</em> to talent 2</str> </arr> </lst> </lst> Am I doing something wrong in my config or query (I'm using default)? Any help is appreciated. Thanks, Jeff
RE: highlighting in multiValued field
What is your actual query? Did you look at the hl.snippets parameter? Bob Sandiford | Lead Software Engineer | SirsiDynix P: 800.288.8020 X6943 | bob.sandif...@sirsidynix.com www.sirsidynix.com Join the conversation - you may even get an iPad or Nook out of it! Like us on Facebook! Follow us on Twitter! -----Original Message----- From: Jeffrey Chang [mailto:jclal...@gmail.com] Sent: Thursday, May 26, 2011 11:10 PM To: solr-user@lucene.apache.org Subject: highlighting in multiValued field Hi All, I am having a problem with search highlighting for multiValued fields and am wondering if someone can point me in the right direction. I have in my schema a multiValued field as such: <field name="description" type="text" stored="true" indexed="true" multiValued="true"/> When I search for term Tel, it returns me the correct doc: <doc> ... <arr name="description"> <str>Tel to talent 1</str> <str>Tel to talent 2</str> </arr> ... </doc> When I enable highlighting, it returns me the following highlight with only one vector returned: ... <lst name="highlighting"> <lst name="1"> <arr name="description"> <str><em>Tel</em> to talent 1</str> </arr> </lst> </lst> What I'm expecting is actually both vectors to be returned as such: <lst name="highlighting"> <lst name="1"> <arr name="description"> <str><em>Tel</em> to talent 1</str> <str><em>Tel</em> to talent 2</str> </arr> </lst> </lst> Am I doing something wrong in my config or query (I'm using default)? Any help is appreciated. Thanks, Jeff
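Since hl.snippets defaults to 1, a default query returns at most one highlighted fragment per field, which matches the single result described in the question. A minimal sketch of a request that asks for more follows; the field name and term come from the message, while the base URL and the helper name `highlight_query_url` are assumptions.

```python
from urllib.parse import urlencode


def highlight_query_url(term, field="description", snippets=2,
                        base_url="http://localhost:8983/solr/select"):
    """Build a select URL that asks for up to `snippets` highlights per field."""
    params = [
        ("q", f"{field}:{term}"),
        ("hl", "true"),                  # turn highlighting on
        ("hl.fl", field),                # field(s) to highlight
        ("hl.snippets", str(snippets)),  # the default of 1 yields a single snippet
    ]
    return base_url + "?" + urlencode(params)


print(highlight_query_url("Tel"))
# prints http://localhost:8983/solr/select?q=description%3ATel&hl=true&hl.fl=description&hl.snippets=2
```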
RE: highlighting in multiValued field
The only thing I can think of is to post-process your snippets. I.e., pull the highlighting tags out of the snippet strings, look for a match among the values of your result's description field, and if you find one, replace that description with the original highlight text (i.e. with the highlight tags still in place). Bob Sandiford | Lead Software Engineer | SirsiDynix P: 800.288.8020 X6943 | bob.sandif...@sirsidynix.com www.sirsidynix.com -----Original Message----- From: Jeffrey Chang [mailto:jclal...@gmail.com] Sent: Friday, May 27, 2011 12:16 AM To: solr-user@lucene.apache.org Subject: Re: highlighting in multiValued field Hi Bob, I have no idea how I missed that! Thanks for pointing me to hl.snippets - that did the magic! Please allow me to squeeze in one more question along the same line. Since I'm now able to display multiple snippets, what I'm trying to achieve is to determine which highlighted snippet maps back to which position in the original document. E.g. if I search for Tel, with highlighting and hl.snippets=2, it'll return me: <doc> ... <arr name="descID"> <str>1</str> <str>2</str> <str>3</str> </arr> <arr name="description"> <str>Tel to talent 1</str> <str>Tel to talent 2</str> <str>Tel to talent 3</str> </arr> ... </doc> <lst name="highlighting"> <lst name="1"> <arr name="description"> <str><em>Tel</em> to talent 1</str> <str><em>Tel</em> to talent 2</str> </arr> </lst> ... Is there a way for me to figure out which highlighted snippet belongs to which descID, so I can also display the non-highlighted rows for my search results? Or is this not how highlighting is designed to be used? Thanks so much, Jeff [snip]
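The post-processing Bob describes could look roughly like the sketch below. It assumes the highlighter wraps matches in `<em>` tags and that each snippet covers an entire field value, which holds for the short values in the example but may not for values longer than the fragment size. The function name `map_snippets_to_ids` is illustrative.

```python
import re

# Matches the opening and closing highlight tags used in the example response.
TAG_PATTERN = re.compile(r"</?em>")


def map_snippets_to_ids(desc_ids, descriptions, snippets):
    """Pair each descID with its highlighted text, or the plain text if unmatched.

    Assumption: descIDs and descriptions are parallel lists, and every snippet
    equals one whole description once the highlight tags are removed.
    """
    plain_to_snippet = {TAG_PATTERN.sub("", snippet): snippet for snippet in snippets}
    return [
        (desc_id, plain_to_snippet.get(description, description))
        for desc_id, description in zip(desc_ids, descriptions)
    ]


rows = map_snippets_to_ids(
    ["1", "2", "3"],
    ["Tel to talent 1", "Tel to talent 2", "Tel to talent 3"],
    ["<em>Tel</em> to talent 1", "<em>Tel</em> to talent 2"],
)
for desc_id, text in rows:
    print(desc_id, text)
```

With the data from the message, rows 1 and 2 come back highlighted and row 3 comes back as plain text, so all three can be displayed in their original order.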
Query regarding Solr-2242 patch for getting distinct facet counts.
The SOLR-2242 patch for getting the count of distinct facet terms doesn't work for distributedProcess (https://issues.apache.org/jira/browse/SOLR-2242). The error log says:
HTTP ERROR 500 Problem accessing /solr/select. Reason: For input string: numFacetTerms
java.lang.NumberFormatException: For input string: numFacetTerms
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
at java.lang.Long.parseLong(Long.java:403)
at java.lang.Long.parseLong(Long.java:461)
at org.apache.solr.schema.TrieField.readableToIndexed(TrieField.java:331)
at org.apache.solr.schema.TrieField.toInternal(TrieField.java:344)
at org.apache.solr.handler.component.FacetComponent$DistribFieldFacet.add(FacetComponent.java:619)
at org.apache.solr.handler.component.FacetComponent.countFacets(FacetComponent.java:265)
at org.apache.solr.handler.component.FacetComponent.handleResponses(FacetComponent.java:235)
at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:290)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
at org.mortbay.jetty.Server.handle(Server.java:326)
at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
The query I passed: http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=2&facet.field=648&facet.mincount=1&facet.limit=-1&f.2.facet.numFacetTerms=1&rows=0&shards=localhost:8983/solr,localhost:8985/solrtwo
Can anyone suggest the changes I need to make to enable the same functionality for shards? When I run it against a single core, I get the correct results. I have applied the SOLR-2242 patch to Solr 1.4.1. Awaiting a reply. Regards, Rajani
solr Invalid Date in Date Math String/Invalid Date String
Hi all, I am using Solr 1.4.1 (according to the Solr info page), but no matter which date field type I use (date or tdate) as defined in the default schema.xml, I cannot do a search in the solr-admin analysis.jsp:
fieldtype: date (or tdate)
fieldvalue (index): 2006-12-22T13:52:13Z (I type it in manually, no trailing space)
fieldvalue (query), the only successful case: 2006-12-22T13:52:13Z
All of the searches below fail:
* TO NOW
[* TO NOW]
2006-12-22T00:00:00Z TO 2006-12-22T23:59:59Z
2006\-12\-22T00\:00\:00Z TO 2006\-12\-22T23\:59\:59Z
[2006-12-22T00:00:00Z TO 2006-12-22T23:59:59Z]
[2006\-12\-22T00\:00\:00Z TO 2006\-12\-22T23\:59\:59Z]
2006-12-22T00:00:00.000Z TO 2006-12-22T23:59:59.999Z
2006\-12\-22T00\:00\:00\.000Z TO 2006\-12\-22T23\:59\:59\.999Z
[2006-12-22T00:00:00.000Z TO 2006-12-22T23:59:59.999Z]
[2006\-12\-22T00\:00\:00\.000Z TO 2006\-12\-22T23\:59\:59\.999Z]
2006-12-22T00:00:00Z TO *
2006\-12\-22T00\:00\:00Z TO *
[2006-12-22T00:00:00Z TO *]
[2006\-12\-22T00\:00\:00Z TO *]
2006-12-22T00:00:00.000Z TO *
2006\-12\-22T00\:00\:00\.000Z TO *
[2006-12-22T00:00:00.000Z TO *]
[2006\-12\-22T00\:00\:00\.000Z TO *]
(and vice versa)
I get either an "Invalid Date in Date Math String" or an "Invalid Date String" error. What's wrong with it? Can anyone please help me with that? Thank you. -- View this message in context: http://lucene.472066.n3.nabble.com/solr-Invalid-Date-in-Date-Math-String-Invalid-Date-String-tp2991763p2991763.html Sent from the Solr - User mailing list archive at Nabble.com.
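A likely explanation, offered with some caution: the analysis page passes the typed value to the field type itself, which expects a single date, so range syntax such as `[a TO b]` is probably never parsed there. Range syntax belongs in the `q` (or `fq`) parameter of a normal select request, prefixed with a field name. The sketch below builds such a range clause; the field name `timestamp` is hypothetical, and the helper names are illustrative.

```python
from datetime import datetime, timezone


def solr_date(moment):
    """Format a datetime in the UTC form Solr date fields expect.

    Note: a naive datetime is interpreted as local time by astimezone().
    """
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def date_range_query(field, start=None, end=None):
    """Build a range clause for the q or fq parameter; None means open-ended (*)."""
    lower = solr_date(start) if start else "*"
    upper = solr_date(end) if end else "*"
    return f"{field}:[{lower} TO {upper}]"


day_start = datetime(2006, 12, 22, 0, 0, 0, tzinfo=timezone.utc)
day_end = datetime(2006, 12, 22, 23, 59, 59, tzinfo=timezone.utc)
print(date_range_query("timestamp", day_start, day_end))
# prints timestamp:[2006-12-22T00:00:00Z TO 2006-12-22T23:59:59Z]
```

The resulting string would go into the query box of the admin search form or into the `q` parameter of `/select`, not into the analysis page.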