Re: Sole core naming convention for multicores
Solr will not throw any exceptions; at least, there is no code that checks for that. Choose names made of characters that are valid in a URL.
On Mon, May 18, 2009 at 11:08 AM, KK dioxide.softw...@gmail.com wrote:
Thank you Otis. One silly question: how would I know that a particular character is forbidden? I think Solr will give me exceptions saying that some characters are not allowed, right? Thanks, KK.
On Sun, May 17, 2009 at 3:12 AM, Otis Gospodnetic otis_gospodne...@yahoo.com wrote:
KK, That should work just fine. Should any of the characters in email addresses turn out to be forbidden, just replace them consistently. For example, if @ turns out to be the problem, you could simply replace it with _. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message - From: KK dioxide.softw...@gmail.com To: solr-user@lucene.apache.org Sent: Saturday, May 16, 2009 3:45:01 AM Subject: Sole core naming convention for multicores
Hi All, I'm trying to set up multicore for Solr [lol, I'm finding the multicore config a bit difficult; any good/simple steps or pointers for doing this?]. Let me come to the point: essentially, whenever a person registers for our service, I want to use his mail-id [this is unique] as the core name. I don't know if it's viable or not. As per the wiki example, the creation/registration of a new core is done like this:
http://localhost:8983/solr/admin/cores?action=CREATE&name=coreX&instanceDir=path_to_instance_directory&config=config_file_name.xml&schema=schem_file_name.xml&dataDir=data
This gives the name as something like coreX, where X is replaced by a number. Is it possible to have a name like, say, alex...@abc.com? If not, maybe I'll have to map the mail-id to some unique number that I'll use as the core name. I don't want to do all this [and don't know how either], hence my question. Do let me know some smart ways of doing the same. Note: I have to use the mail-id as the unique identifier. Thanks in appreciation.
Thanks, KK
-- Noble Paul | Principal Engineer | AOL | http://aol.com
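Otis's "replace them consistently" advice could be sketched roughly as below. This is only an illustration: the whitelist of characters kept unchanged and the function name core_name_from_email are assumptions made here, not a rule that Solr enforces, and alex@abc.com is a made-up address.

```python
import re

# Characters outside this set are replaced. The whitelist is an assumption
# chosen to be URL-friendly; it is not something Solr itself checks.
_UNSAFE = re.compile(r"[^A-Za-z0-9._-]")


def core_name_from_email(email: str) -> str:
    """Map an email address to a URL-safe core name by replacing unsafe characters with '_'."""
    return _UNSAFE.sub("_", email.strip().lower())


print(core_name_from_email("alex@abc.com"))  # alex_abc.com
```

Note that a plain replacement is not reversible and can collide: a_b@c.com and a@b_c.com both map to a_b_c.com. If that matters, a numeric user id as the core name avoids the problem.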
Simplest way of registering new solr core!
Hi, What is the simplest way of registering a new Solr core? Do we have to use some standard API for this, or will making an Ajax GET call to http://localhost:8983/solr/admin/cores with the proper request parameters, like these,
[?action=CREATE&name=coreX&instanceDir=path_to_instance_directory&config=config_file_name.xml&schema=schem_file_name.xml&dataDir=data]
do the job? I think this should work. Correct me if I'm wrong. The process remains the same for other requests as well [like getting core status, reload, rename etc.], right? Also, I would be thankful if someone could point me to a good tutorial on adding new cores to the existing ones, configuring multicore indexing etc. I don't find the Solr wiki that useful for this. Thanks, KK
2009/5/14 Noble Paul നോബിള് नोब्ळ् noble.p...@corp.aol.com
Usually it happens fast, and the request returns only after the core creation is completed. So if you are doing all the operations in the same thread, there is no need to wait.
On Thu, May 14, 2009 at 7:37 PM, KK dioxide.softw...@gmail.com wrote:
Thank you very much. LOL, it's in the same wiki I was told to go through. I have a question regarding creating Solr cores on the fly. The wiki says: "Creates a new core and register it. If persistence is enabled (persist=true), the configuration for this new core will be saved in 'solr.xml'. If a core with the same name exists, while the new created core is initializing, the old one will continue to accept requests. Once it has finished, all new request will go to the new core, and the old core will be unloaded." So I have to wait for some time [say a couple of seconds, maybe less] before I start adding pages to that core. I think this is the way to handle it; otherwise some content which should have been indexed by the new core will get indexed by the existing core [as the wiki says], which I don't want to happen. Any other ideas for handling this? Thanks, KK.
2009/5/14 Noble Paul നോബിള് नोब्ळ् noble.p...@corp.aol.com
Solr already supports this. Please refer to http://wiki.apache.org/solr/CoreAdmin#head-7ca1b98a9df8b8ca0dcfbfc49940ed5ac98c4a08 and ensure that your solr.xml is persistent: http://wiki.apache.org/solr/CoreAdmin#head-7508c24c6e2dadad2dfea39b2fba045062481da8
On Thu, May 14, 2009 at 3:43 PM, KK dioxide.softw...@gmail.com wrote:
Thank you very much. Got the point. One off-the-track question: can we automate the creation of new cores [as far as I know it requires manually editing the solr.xml file; and what about the location of the core index directory, do we need to point to that manually as well]? After going through the wiki, what I found is that we have to mention the names of the cores in solr.xml. I want to automate the process in such a way that when a user registers [on, say, my site for the service], we create a corresponding core for that user with a specific core id [unique to this user only], so that the user is given a search interface that redirects all searches for this user to http://host:port/unique core name for this user/select. I will appreciate any ideas on this. Thanks, KK.
2009/5/14 Noble Paul നോബിള് नोब्ळ् noble.p...@corp.aol.com
There is no hard limit on the number of cores. It is limited by your system's ability to open files and by its resources. Queries are automatically sent to the appropriate core if your URL is http://host:port/corename/select
On Thu, May 14, 2009 at 1:58 PM, KK dioxide.softw...@gmail.com wrote:
I want to know the maximum number of cores supported by Solr: thousands, or maybe millions, all under one Solr instance? Also I want to know how to redirect a particular query to a particular core. Actually I'm querying Solr from Ajax, so I think there must be some request parameter that says which core we want to query, right? Can someone tell me how to do this? Any good pointers on the same will be helpful as well. Thank you.
--kk
-- Noble Paul | Principal Engineer | AOL | http://aol.com
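The CREATE call and the per-core query URL discussed above could be assembled along these lines. This is a sketch, not tested against a live Solr: the base URL, the default file names and the helper names are assumptions; only the parameter names (action, name, instanceDir, config, schema, dataDir) and the corename/select pattern come from the thread.

```python
from urllib.parse import quote, urlencode


def create_core_url(base, name, instance_dir,
                    config="solrconfig.xml", schema="schema.xml", data_dir="data"):
    """Build a CoreAdmin CREATE URL with properly separated and escaped parameters."""
    params = [
        ("action", "CREATE"),
        ("name", name),
        ("instanceDir", instance_dir),
        ("config", config),
        ("schema", schema),
        ("dataDir", data_dir),
    ]
    return f"{base}/admin/cores?{urlencode(params)}"


def select_url(base, core, query):
    """Build a query URL that targets one core through its path segment."""
    return f"{base}/{quote(core, safe='')}/select?{urlencode({'q': query})}"


base = "http://localhost:8983/solr"  # assumed deployment URL
print(create_core_url(base, "core42", "/var/solr/core42"))
print(select_url(base, "core42", "title:solr"))
```

Since, as Noble notes, the CREATE request returns only after the core has been created, a client that issues the create call and then its first add request sequentially should not need an artificial delay between the two.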
Re: Simple search returns no documents
Hi Otis, That seems to be the problem. The term positions for the index are 1 and 3, while the positions for the query are 1 and 2. The StopFilterTokenizer was set with setEnablePositionIncrementsDefault=true for the index analyzer, while it was not enabled for the query analyzer. Thanks! -- Jeffrey Gelens
On Saturday 16 May 2009 01:55:07 Otis Gospodnetic wrote:
Hi Jeffrey, And now try: ?q=facility_indexed:"kooklessen en workshops"~1 If that works, head over to the Solr Admin Analysis page, enter the field name, and that phrase for both the index and the query analyzer. Then look at the term positions for your two main terms/tokens. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message - From: Jeffrey Gelens jeffrey.gel...@buyways.nl To: solr-user@lucene.apache.org Sent: Friday, May 15, 2009 10:47:00 AM Subject: Simple search returns no documents
Hello all, I've got a weird problem with a simple field search. The field facility_indexed has the following terms: - kooklessen (freq: 422) - workshop (freq: 422). These terms were tokenized from the string: Kooklessen en Workshops. So during insertion into Solr, the string was successfully indexed. When I send the following query to Solr: ?q=facility_indexed:"kooklessen en workshops" Solr returns no documents at all. I checked the tokenization of this query in the analyser and it breaks down the same way as when it's indexed (so as kooklessen and workshop). If I query as follows: (?q=facility_indexed:kooklessen) then Solr DOES return 422 documents, which is correct. I get the same result when I query only for 'workshops'. Querying other strings works correctly; it's only this particular string which causes this weird problem. Any help with debugging, or of course a possible solution, would be great! ;-) Thanks! -- Jeffrey Gelens
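The position mismatch Jeffrey found can be reproduced with a small toy model. This is not Lucene's analysis or phrase-matching code: the one-word stop list, the whitespace tokenizer and the simplified slop check are all assumptions for illustration, and stemming (workshops to workshop) is left out.

```python
STOP_WORDS = {"en"}  # Dutch "and"; a one-word stop list for illustration


def analyze(text, enable_position_increments):
    """Lowercase, split on whitespace, drop stop words, and number the surviving terms."""
    terms = []
    position = 0
    for token in text.lower().split():
        if token in STOP_WORDS:
            if enable_position_increments:
                position += 1  # leave a gap where the stop word was
            continue
        position += 1
        terms.append((token, position))
    return terms


def phrase_matches(indexed, queried, slop=0):
    """Toy phrase check: same terms in order, relative positions differing by at most slop."""
    if [t for t, _ in indexed] != [t for t, _ in queried]:
        return False
    return all(
        abs((ip - indexed[0][1]) - (qp - queried[0][1])) <= slop
        for (_, ip), (_, qp) in zip(indexed, queried)
    )


indexed = analyze("Kooklessen en Workshops", enable_position_increments=True)
queried = analyze("kooklessen en workshops", enable_position_increments=False)
print(indexed)  # positions 1 and 3
print(queried)  # positions 1 and 2
print(phrase_matches(indexed, queried, slop=0))  # False
print(phrase_matches(indexed, queried, slop=1))  # True
```

With the gap preserved only on the index side, an exact phrase cannot line up, which is consistent with the query matching once a slop of 1 is allowed and with the fix of using the same setting in both analyzers.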
Howto? Applying a filter across schema fileds using state information
Hi, I need to write a filter that extracts information from the content of one field (say the Body field) and then, based on this content, applies some transformation to a *different* field (say the Title field). Is this possible? Example: I will find certain keywords in the body and then locate them and transform them in the title. -- View this message in context: http://www.nabble.com/Howto--Applying-a-filter-across-schema-fileds-using-state-information-tp23593424p23593424.html Sent from the Solr - User mailing list archive at Nabble.com.
Howto? Obtain the IndexReader from within a solr filter
Hi, I am writing a solr filter that needs the DocFreq of each Token in order to decide what to do with it. What is the easiest way to obtain this information from within the filter code ? thanks, Yatir
Query with AND|OR operator with Dismaxrequest
Hi, I am not getting correct results with a query which has multiple AND | OR operators. Query format: q=((A AND B) OR (C OR D) OR E)
?q=((intAgeFrom_product_i:[0+TO+3]+AND+intAgeTo_product_i:[3+TO+*])+OR+(intAgeFrom_product_i:[0+TO+3]+AND+intAgeTo_product_i:[0+TO+3])+OR+(ageFrom_product_s:Adult))&qt=dismaxrequest
The query returns correct results without dismaxrequest, but incorrect results with dismaxrequest. I have to use dismaxrequest because I need boosting of search results. According to some posts there are issues with the AND | OR operators with dismaxrequest. Please let me know if anyone has faced the same problem and if there is any way to make the query work with dismaxrequest. Thanks, Prerna
Re: Simplest way of registering new solr core!
Do we have to extend the CoreAdminHandler class and use some methods therein to register a new core? Thanks in appreciation. --KK
Re: UK Solr users meeting?
I know of a few people who'd be interested; we've got quite a few projects using Solr down here in Brighton.
On 14 May 2009, at 10:41, Fergus McMenemie wrote:
I was wondering if there is an interest in a UK (South East) Solr user group meeting. Please let me know if you are interested. I am happy to organize. Regards, Colin
Yes, very interested. I am in Lincolnshire. -- Fergus McMenemie Email:fer...@twig.me.uk Techmore Ltd Phone:(UK) 07721 376021 Unix/Mac/Intranets Analyst Programmer
Toby Cole Software Engineer Semantico Lees House, Floor 1, 21-23 Dyke Road, Brighton BN1 3FE T: +44 (0)1273 358 238 F: +44 (0)1273 723 232 E: toby.c...@semantico.com W: www.semantico.com
Re: Simplest way of registering new solr core!
On Mon, May 18, 2009 at 4:06 PM, KK dioxide.softw...@gmail.com wrote:
Do we have to extend the CoreAdminHandler class and use some methods therein to register a new core?
No. The CREATE command will do that.
-- Noble Paul | Principal Engineer | AOL | http://aol.com
Query Boost Functions
I have a field named last-modified that I would like to use in the bf (Boost Functions) parameter of DisMaxRequestHandler: recip(rord(last-modified),1,1000,1000). However, the Solr query parser complains about the syntax of the formula. I think it is related to the hyphen in the field name. I have tried adding single and double quotes around the field name, but that didn't help. Can a field name contain a hyphen in boost functions? How do I do it? If not, where do I find the restrictions on special characters in field names? -Yao
Re: How to update only few fields in a document
Hello, I went through the ticket, but it seems that nothing is planned before version 1.5. Thanks for your answer! Vincent
Otis Gospodnetic wrote:
Vincent, Unfortunately things haven't changed yet. If all your fields are stored, have a look at SOLR-139. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message - From: Vincent Pérès vincent.pe...@gmail.com To: solr-user@lucene.apache.org Sent: Friday, May 15, 2009 9:45:06 AM Subject: How to update only few fields in a document
Hello, I found only one post about updating documents; maybe things have evolved since then. I need to update one field in a few thousand documents at once (or in multiple requests), but I would rather not have to add a new document in place of the current one (which is how it works, if I understand correctly). Example:
curl http://localhost:8982/solr/update --data-binary '<add><doc><field name="id">15</field></doc><doc><field name="id">22</field></doc></add>' -H 'Content-type:text/xml; charset=utf-8'
This request will replace the current documents with two new ones. The documents contain big text parts and I would rather not have to send them every time. Is there any feature which would allow me to do that? Thanks! Vincent
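Until partial updates exist, the usual fallback is to resend the whole document with the changed field. The sketch below shows only the client-side part, under the assumption that every field is stored so the current values can be read back first; the helper names and the sample field names are made up for illustration.

```python
import xml.etree.ElementTree as ET


def merge_document(stored, changes):
    """Return the full document to resend: stored field values overridden by the changes."""
    return {**stored, **changes}


def build_add_xml(docs):
    """Serialize documents into the <add><doc><field name="...">...</field></doc></add> format."""
    add = ET.Element("add")
    for doc in docs:
        doc_el = ET.SubElement(add, "doc")
        for name, value in doc.items():
            field_el = ET.SubElement(doc_el, "field", name=name)
            field_el.text = str(value)
    return ET.tostring(add, encoding="unicode")


stored = {"id": "15", "title": "Old title", "body": "big text"}  # as read back from the index
print(build_add_xml([merge_document(stored, {"title": "New title"})]))
```

The resulting string can be posted with the same curl command as in Vincent's message. The large text fields still travel over the wire on every update, which is exactly the cost he was hoping to avoid.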
How to gather fields when faceting results ?
Hello, When faceting my results, I would like to link two fields. For example, a parentID field helps me retrieve all the different parent ids for a query, but it can't be used for displaying the facets, as it doesn't mean anything to a user. A second field in my document, parentTitle, is a human-readable transcription of this field; it is not usable as an identifier because several parentIDs can share the same parentTitle. When parentID is used for faceting my results, I would like to retrieve the parentTitle as well, and be able to link the two of them, so that the latter can be displayed instead of the former. Is there a way to accomplish that? Thanks, Kind regards, Pierre-Yves
Re: Howto? Obtain the IndexReader from within a solr filter
Does anyone know if unsubscribe works for this mailing list? I don't seem to be able to unsubscribe.
On May 18, 2009, at 9:12 AM, Noble Paul നോബിള് नोब्ळ् noble.p...@corp.aol.com wrote:
"I am writing a solr filter": what is a Solr filter?
On Mon, May 18, 2009 at 2:18 PM, Yatir yat...@outbrain.com wrote:
Hi, I am writing a solr filter that needs the DocFreq of each Token in order to decide what to do with it. What is the easiest way to obtain this information from within the filter code? thanks, Yatir
-- Noble Paul | Principal Engineer | AOL | http://aol.com
Solr Document Sort
Hi, In the Solr schema file I have an integer field named 'ContentType', defined as follows:
<field name="ContentType" type="int" indexed="true" stored="true"/>
The value of this field can be one of the following: 1 (for News), 2 (for Reviews), 3 (for Opinion), 4 (for Blogs). I have a scenario in which, when a user does a search, the results should be sorted by best match (i.e. from highest relevancy score to lowest). At the same time I want the documents having Blogs as the ContentType value to appear at the bottom of the search results, below the documents having News, Reviews or Opinion as the ContentType. The way I am doing this is by first sorting on the ContentType field and then sorting by score, as follows: sort=ContentType asc,score desc
Is there a better way to do the same? Thanks, Gurjot
Re: Sole core naming convention for multicores
KK - In my experience with multi-core, I've found that using the user record's integer PK for each user core works well by still allowing the user to update their email addresses / usernames over time. cheers, --bemansell On May 17, 2009 10:39 PM, KK dioxide.softw...@gmail.com wrote: Thank you Otis. One silly question, how would I know that a particular character is forbidden, I think Solr will give me exceptions saying that some characters not allowed, right? Thank, KK. On Sun, May 17, 2009 at 3:12 AM, Otis Gospodnetic otis_gospodne...@yahoo.com wrote: KK, ...
Re: Solr Document Sort
Gurjot - Take a look at the Solr Relevancy Cookbook http://wiki.apache.org/solr/SolrRelevancyCookbook - It provides some good guidelines for boosting term ranking. cheers, --bemansell
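To make the difference between the two orderings in this thread concrete, here is a small client-side illustration. It is not a Solr feature, just a sort over made-up result rows; the document ids and scores are invented.

```python
BLOG = 4  # ContentType value for Blogs, as given in the question


def by_content_type_then_score(docs):
    """What sort=ContentType asc,score desc yields: News first, then Reviews, Opinion, Blogs."""
    return sorted(docs, key=lambda d: (d["ContentType"], -d["score"]))


def blogs_last_by_score(docs):
    """What was asked for: non-blog documents ranked purely by score, blogs at the bottom."""
    return sorted(docs, key=lambda d: (d["ContentType"] == BLOG, -d["score"]))


docs = [
    {"id": "a", "ContentType": 4, "score": 9.0},
    {"id": "b", "ContentType": 2, "score": 5.0},
    {"id": "c", "ContentType": 1, "score": 3.0},
    {"id": "d", "ContentType": 3, "score": 7.0},
]
print([d["id"] for d in by_content_type_then_score(docs)])  # ['c', 'b', 'd', 'a']
print([d["id"] for d in blogs_last_by_score(docs)])         # ['d', 'b', 'c', 'a']
```

The current sort puts every News item above every Review regardless of relevancy. One way to get the second ordering inside Solr would be to index an extra field that is 1 for blogs and 0 otherwise (a hypothetical isBlog field) and sort on that before score.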
Re: Query Boost Functions
Yao Ge wrote:
I have a field named last-modified that I would like to use in the bf (Boost Functions) parameter of DisMaxRequestHandler: recip(rord(last-modified),1,1000,1000). However, the Solr query parser complains about the syntax of the formula. I think it is related to the hyphen in the field name. I have tried adding single and double quotes around the field name, but that didn't help. Can a field name contain a hyphen in boost functions? How do I do it? If not, where do I find the restrictions on special characters in field names? -Yao
Hmm, this seems to be a bug. Can you open a JIRA issue? Meanwhile, you can use . or _ instead of -. Koji
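For reference, recip(x,m,a,b) in a Solr function query computes a/(m*x+b), and rord gives the reverse ordinal of the field value, so the most recently modified document gets the smallest x. The snippet below only evaluates that formula for the constants in Yao's boost function; it does not touch Solr, and the sample rord values are arbitrary.

```python
def recip(x, m, a, b):
    """Reciprocal function as used in Solr function queries: a / (m * x + b)."""
    return a / (m * x + b)


# Boost contributed by recip(rord(field),1,1000,1000) at a few reverse ordinals.
for rord in (1, 1000, 9000):
    print(rord, recip(rord, 1, 1000, 1000))
```

The boost starts just below 1 for the newest document, falls to 0.5 at reverse ordinal 1000 and to 0.1 at 9000. Following Koji's workaround, the same function would be written against a renamed field such as last_modified.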
Re: multicore for 20k users?
2009/5/17 Noble Paul നോബിള് नोब्ळ् noble.p...@corp.aol.com: A few questions,
1) what is the frequency of inserts?
A few per day per user at MOST.
2) how many cores need to be up and running at any given point?
That depends on the people. I would love to be able to tie it to their webapp session, maybe 100 at once? No idea, really. Thank you, Chris
On Mon, May 18, 2009 at 3:23 AM, Chris Cornell srchn...@gmail.com wrote:
Trying to create a search solution for about 20k users at a company. Each person's documents are private and different (some overlap... it would be nice to not have to store/index copies). Is multicore something that would work, or should we auto-insert a facet into each query generated by the person? Thanks for any advice, I am very new to Solr. Any tiny push in the right direction would be appreciated. Thanks, Chris
-- Noble Paul | Principal Engineer | AOL | http://aol.com
Re: Date field
My bad, it was an id10t error.
On Fri, May 15, 2009 at 8:21 PM, Otis Gospodnetic otis_gospodne...@yahoo.com wrote:
Jack, Which bug are you referring to? Last time I played with function queries with date fields, things worked as expected. If there is/was a known bug, it must be in JIRA... Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message - From: Jack Godwin god...@gmail.com To: solr-user@lucene.apache.org Sent: Thursday, May 14, 2009 8:09:13 AM Subject: Date field
Does anyone know if there is still a bug in date fields? I'm having a problem boosting documents by date in Solr 1.3. Thanks, Jack
-- Sent from my mobile device
Re: How to gather fields when faceting results ?
How about having a single facetable field with values parentId_parentTitle? Get rid of the parentId and the underscore as a post process. Cheers Avlesh
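Avlesh's combined-field idea might look like this on the client side. The helper names are made up, and the sketch assumes that a parent id never contains an underscore itself, so the first underscore always marks the boundary.

```python
def facet_value(parent_id, parent_title):
    """Value to index in the single facetable field: id and title joined by an underscore."""
    return f"{parent_id}_{parent_title}"


def split_facet_value(value):
    """Post-process a facet value back into (parent_id, parent_title), splitting on the first underscore."""
    parent_id, _, parent_title = value.partition("_")
    return parent_id, parent_title


value = facet_value("42", "Annual Report_2008")
print(value)                     # 42_Annual Report_2008
print(split_facet_value(value))  # ('42', 'Annual Report_2008')
```

Because the id is part of the value, two parents that share a title still produce two distinct facet entries, and the UI can display only the title part.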
Re: Solr Shard - Strange results
I'm not quite sure what logs you are talking about, but in the tomcat/logs/catalina.out logs I found the following [note, I can't copy/paste, so I am typing up a summary]:
I execute the command: localhost:8080/bravo/select?q=fred&rows=102&start=0&shards=localhost:8080/alpha,localhost:8080/bravo
In this example, alpha has 27 instances of fred, while bravo has 0. Then in catalina.out:
- There is the request for the command I sent, shards parameters and all. It has the proper queryString.
- Then I see the two requests sent to the shards, alpha and bravo. These two requests weave between each other until they are finished:
INFO: REQUEST URI =/alpha/select
INFO: REQUEST URI =/bravo/select
The parameters have changed to: wt=javabin&fsv=true&version=2.2&fl=docNumber,score&q=fred&rows=102&isShard=true&start=0
- Then 2 INFOs scroll across:
INFO: [] webapp=/bravo path=/select params={wt=javabin&fsv=true&version=2.2&fl=docNumber,score&q=fred&rows=102&isShard=true&start=0} hits=0 status=0 QTime=1
INFO: [] webapp=/alpha path=/select params={wt=javabin&fsv=true&version=2.2&fl=docNumber,score&q=fred&rows=102&isShard=true&start=0} hits=27 status=0 QTime=1
**Note, hits=27
- Then I see some octet-streams being transferred, with status 200, so those are OK.
- Then I see something peculiar: it calls alpha with the following parameters: wt=javabin&version=2.2&ids=ABC-1353,ABC-408,ABC-1355,ABC-1824,ABC-1354,FRED-ID-27,55&q=fred&rows=102&parameter=isShard=true&start=0
Performing this query on my own (without the wt=javabin) gives me numFound=2, the result set I get back from the overarching query. Changing it to rows=10, it gives me numFound=2, and 2 docs. This is not the strange behaviour I was seeing with the overarching query and the mismatched numFound and docs. This does raise the question: why did it add ids=ABC-1353,ABC-408,ABC-1355,ABC-1824,ABC-1354,FRED-ID-27,55 to the query? They are in the format that would be under docNumber, if that helps. Any thoughts?
I will do some research on those particular ID-numbered docs in the meantime. Here's the configuration information. I only posted the differences from the default files in solr/example/solr/conf.
[solrconfig.xml]
<config>
<dataDir>${solr.data.dir:/data/indices/bravo/solr/data}</dataDir>
<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
<lst name="defaults">
<str name="config">/data/indices/bravo/solr/conf/data-config.xml</str>
</lst>
</requestHandler>
</config>
[schema.xml]
<schema>
<fields>
<field name="docNumber" type="text" indexed="true" stored="true" />
<field name="column1" type="text" indexed="true" stored="true" />
<field name="column2" type="text" indexed="true" stored="true" />
<field name="column3" type="text" indexed="true" stored="true" />
<field name="column4" type="text" indexed="true" stored="true" />
<field name="column5" type="text" indexed="true" stored="true" />
<field name="column6" type="text" indexed="true" stored="true" />
<field name="column7" type="text" indexed="true" stored="true" />
<field name="column8" type="text" indexed="true" stored="true" />
<field name="column9" type="text" indexed="true" stored="true" />
</fields>
<uniqueKey>docNumber</uniqueKey>
<defaultSearchField>column2</defaultSearchField>
</schema>
[data-config.xml]
<dataConfig>
<dataSource type="JdbcDataSource" driver="com.metamatrix.jdbc.MMDriver" url="jdbc:metamatrix:b...@mms://hostname:port" user="username" password="password"/>
<document name="DOC_NAME">
<entity name="ENT_NAME" query="select * from ASDF.TABLE">
<field column="TABLE_COL_NO" name="docNumber" />
<field column="TABLE_COL_1" name="column1" />
<field column="TABLE_COL_2" name="column2" />
<field column="TABLE_COL_3" name="column3" />
<field column="TABLE_COL_4" name="column4" />
<field column="TABLE_COL_5" name="column5" />
<field column="TABLE_COL_6" name="column6" />
<field column="TABLE_COL_7" name="column7" />
<field column="TABLE_COL_8" name="column8" />
<field column="TABLE_COL_9" name="column9" />
</entity>
</document>
</dataConfig>
Yonik Seeley-2 wrote:
On Fri, May 15, 2009 at 4:11 PM, CB-PO charles.bush...@gmail.com wrote: Yeah, the first thing I thought of was that perhaps
there was something wrong with the uniqueKey and they were clashing between the indexes, however upon visual inspection of the data the field we are using as the unique key in each of the indexes is grossly different between the two databases, so there is no chance of them clashing. Yes, but is the same fieldname and FieldType used for both indexes? (that's sort of a requirement) You might also
Re: Query with AND|OR operator with Dismaxrequest
Prerna,

Yes, DisMax doesn't take in queries with Boolean operators. But I believe there is a patch in JIRA that makes that possible.

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message -----
From: prerna07 pkhandelw...@sapient.com
To: solr-user@lucene.apache.org
Sent: Monday, May 18, 2009 6:05:50 AM
Subject: Query with AND|OR operator with Dismaxrequest

Hi,

I am not getting correct results with a query which has multiple AND | OR operators.

Query format: q=((A AND B) OR (C OR D) OR E)

    ?q=((intAgeFrom_product_i:[0+TO+3]+AND+intAgeTo_product_i:[3+TO+*])+OR+(intAgeFrom_product_i:[0+TO+3]+AND+intAgeTo_product_i:[0+TO+3])+OR+(ageFrom_product_s:Adult))&qt=dismaxrequest

The query returns correct results without Dismaxrequest, but incorrect results with Dismaxrequest. I have to use dismaxrequest because I need boosting of search results.

According to some posts there are issues with AND | OR operators with dismaxrequest. Please let me know if anyone has faced the same problem and if there is any way to make the query work with dismaxrequest.

Thanks,
Prerna

--
View this message in context: http://www.nabble.com/Query-with-AND%7COR-operator-with-Dismaxrequest-tp23594592p23594592.html
Sent from the Solr - User mailing list archive at Nabble.com.
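
A possible workaround, not proposed by anyone in this thread and so only a sketch: leave the free-text part in q for the dismax handler and move the Boolean and range clauses into an fq (filter query) parameter, which Solr parses with the standard Lucene syntax. The base URL and the search text "toys" below are made-up placeholders; the field names and the qt value come from the message above. Note that fq only restricts the result set and does not contribute to scoring.

```python
from urllib.parse import urlencode


def build_dismax_url(base_url, user_text, filter_query):
    """Build a select URL: free text goes to dismax, Boolean logic goes to fq."""
    params = [
        ("qt", "dismaxrequest"),  # handler name taken from the original post
        ("q", user_text),         # plain text only; dismax does the boosting
        ("fq", filter_query),     # Boolean/range clauses in standard query syntax
    ]
    return base_url + "?" + urlencode(params)


# The Boolean expression from the post, written with spaces (urlencode escapes it).
age_filter = (
    "(intAgeFrom_product_i:[0 TO 3] AND intAgeTo_product_i:[3 TO *])"
    " OR (intAgeFrom_product_i:[0 TO 3] AND intAgeTo_product_i:[0 TO 3])"
    " OR (ageFrom_product_s:Adult)"
)

# Hypothetical host and search text, for illustration only.
url = build_dismax_url("http://localhost:8983/solr/select", "toys", age_filter)
print(url)
```

Whether this fits depends on whether the Boolean part is meant to filter or to influence ranking; if it must affect the score, the JIRA patch Otis mentions is the place to look.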
Re: Solr Shard - Strange results
Maybe you want to try with docNumber field type as string and see if it would make a difference.
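
On the ids=... request seen in the log: Solr's distributed search works in two phases. The first request to each shard asks only for the unique key and score (the fl=docNumber,score request above); the coordinator merges those lists and then sends a second request with ids=... to fetch the stored fields of the winning documents from the shards that own them. The following is a toy simulation of that flow, not Solr code: the in-memory "shards", the substring scoring, and all function names are invented for illustration.

```python
def phase_one(shard_docs, term):
    """Per-shard query: return only (unique key, score) for matching docs."""
    return [
        (doc_id, text.count(term))
        for doc_id, text in shard_docs.items()
        if term in text
    ]


def distributed_search(shards, term, rows):
    """Merge per-shard hits, then fetch the top docs by id from their shards."""
    hits = []
    for shard_name, docs in shards.items():
        for doc_id, score in phase_one(docs, term):
            hits.append((score, doc_id, shard_name))

    # Coordinator merge: best score first, id as a tie-breaker.
    hits.sort(key=lambda hit: (-hit[0], hit[1]))
    top = hits[:rows]

    # Phase two: group the winning ids per shard (this is the "ids=" request).
    ids_by_shard = {}
    for _, doc_id, shard_name in top:
        ids_by_shard.setdefault(shard_name, []).append(doc_id)

    results = [(doc_id, shards[shard_name][doc_id]) for _, doc_id, shard_name in top]
    return len(hits), ids_by_shard, results


shards = {
    "alpha": {"ABC-1353": "fred fred", "ABC-408": "fred", "XYZ-1": "barney"},
    "bravo": {"B-1": "wilma"},
}
num_found, ids_by_shard, results = distributed_search(shards, "fred", rows=10)
print(num_found, ids_by_shard, results)
```

Because the second phase looks documents up by their unique key, the key field has to match exactly on every shard, which is presumably why a string-typed uniqueKey is being suggested here instead of a tokenized text type.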
Re: Indexing issue in DIH - not all records are Indexed
Hi Noble,

Many thanks for the reply.

Yes, there is a uniqueKey in the schema, which is the ProductID. I also tried <uniqueKey required="false">PROD_ID</uniqueKey>, but no luck: still only one document is seen after querying *:*.

I have attached the schema.xml used for your reference; please advise.

Thanks and regards,
Jay

2009/5/16 Noble Paul നോബിള് नोब्ळ् noble.p...@corp.aol.com
> Check whether you have a uniqueKey in your schema. If there are duplicates, they are overwritten.
>
> On Sat, May 16, 2009 at 1:38 AM, jayakeerthi s mail2keer...@gmail.com wrote:
>> I am using Solr for our application with JBoss integration.
>>
>> I have managed to configure the indexing from an Oracle db for 22 fields. Here is the db-data-config.xml:
>>
>> <dataConfig>
>>   <dataSource type="JdbcDataSource" driver="oracle.jdbc.driver.OracleDriver" url="jdbc:oracle:thin:@camatld6.***.com:1521:atlasint" user="service_product_lgd" password=""/>
>>   <document name="products">
>>     <entity name="PROD" transformer="RegexTransformer"
>>       query="SELECT A.PROD_ID,A.PROD_CD,C.REG_CMRC_STYL_NM,C.SAP_LANG_ID,A.DIV_ID ,c.SIZE_RUN_DESC,
>>         c.INSM_DESC, c.OTSM_DESC, c.DIM_DESC, c.PRFL_DESC,c.UPR_DESC,c.MDSL_DESC,c.OUTSL_DESC,c.CTNT_DESC,
>>         D.SPORT_ACTY_DESC, E.GNDR_AGE_DESC, A.PO_GRID_DESC,A.COLR_DISP_CD, B.STYL_CD , A.SILO_ID, A.SILH_ID,
>>         F.SILH_DESC, g.SILO_DESC , h.FRST_PROD_OFFR_DT, h.END_FTR_OFFR_DT,
>>         h.RETL_PR_AMT,h.RETL_CRCY_ID,h.WHSLE_PR_AMT,h.WHSLE_CRCY_ID,I.ORG_LGCY_DIV_CD
>>         from PROD A ,PROD_STYL B ,PROD_REG_CMRC_STYL C , PROD_SPORT_ACTY D , PROD_GNDR_AGE E ,
>>         PROD_SILH F, PROD_SILO G, PROD_REG H, ORG_DIV I
>>         WHERE A.PROD_STYL_ID=B.PROD_STYL_ID AND A.PROD_STYL_ID = c.PROD_STYL_ID
>>         AND B.PROD_STYL_ID = C.PROD_STYL_ID AND A.SPORT_ACTY_ID = d.SPORT_ACTY_ID
>>         AND A.GNDR_AGE_ID = E.GNDR_AGE_ID and A.SILH_ID = F.SILH_ID AND A.SILO_ID = G.SILO_ID
>>         AND A.PROD_ID = H.PROD_ID AND A.DIV_ID = I.DIV_ID">
>>     </entity>
>>   </document>
>> </dataConfig>
>>
>> I have attached the schema.xml used, and have done a full-import:
>>
>> http://localhost:8983/solr/dataimport?command=full-import
>>
>> <response>
>>   <lst name="responseHeader">
>>     <int name="status">0</int>
>>     <int name="QTime">0</int>
>>   </lst>
>>   <lst name="initArgs">
>>     <lst name="defaults">
>>       <str name="config">C:\apache-solr-nightly\example\example-DIH\solr\db\conf\db-data-config.xml</str>
>>     </lst>
>>   </lst>
>>   <str name="command">full-import</str>
>>   <str name="status">idle</str>
>>   <str name="importResponse"/>
>>   <lst name="statusMessages">
>>     <str name="Total Requests made to DataSource">1</str>
>>     <str name="Total Rows Fetched">15</str>
>>     <str name="Total Documents Skipped">0</str>
>>     <str name="Full Dump Started">2009-05-11 11:27:02</str>
>>     <str name="">Indexing completed. Added/Updated: 15 documents. Deleted 0 documents.</str>
>>     <str name="Committed">2009-05-11 11:27:05</str>
>>     <str name="Optimized">2009-05-11 11:27:05</str>
>>     <str name="Time taken ">0:0:2.625</str>
>>   </lst>
>>   <str name="WARNING">This response format is experimental. It is likely to change in the future.</str>
>> </response>
>>
>> The issue I am facing is: though the response is "Indexing completed. Added/Updated: 15 documents. Deleted 0 documents", I am able to see only one document when I query *:*, so all the other 14 documents are missing.
>>
>> Similarly, I tried indexing 1 million records and found only 2500 docs using the *:* query.
>>
>> So could anyone please help resolve this?
>>
>> Regards,
>> Jay
>
> --
> - Noble Paul | Principal Engineer| AOL | http://aol.com

[Attached schema.xml]

<?xml version="1.0" encoding="UTF-8" ?>
<!--
 Licensed to the Apache Software Foundation (ASF) under one or more contributor license agreements. See the NOTICE file distributed with this work for additional information regarding copyright ownership. The ASF licenses this file to You under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

     http://www.apache.org/licenses/LICENSE-2.0

 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
-->
<!--
 This is the Solr schema file. This file should be named "schema.xml" and should be in the conf directory under the solr home (i.e. ./solr/conf/schema.xml by default) or located where the classloader for the Solr webapp can find it.

 This example schema is the recommended starting point for users. It should be kept correct and concise, usable out-of-the-box.

 For more information on how to customize this file, please see http://wiki.apache.org/solr/SchemaXml

 NOTE: this schema includes many optional features and should not be used for benchmarking.
-->
<schema name="example" version="1.2">
  <!-- attribute "name" is the name of this schema and is only used for display
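
To illustrate the overwrite behaviour Noble describes: if every row ends up with the same uniqueKey value (or with no usable value, for example because a column name does not line up with the schema field), each add replaces the previous document even though the import still counts 15 adds. The snippet below is a plain-Python analogy, not Solr code; the dictionary-as-index and the sample rows are assumptions made for the example, and the thread does not confirm that this is the actual cause.

```python
def index_documents(rows, unique_key):
    """Mimic an index keyed by uniqueKey: a later add overwrites an earlier one."""
    index = {}
    added = 0
    for row in rows:
        index[row[unique_key]] = row  # same key -> previous document is replaced
        added += 1
    return index, added


# 15 joined rows that all carry the same PROD_ID (made-up sample data).
rows = [{"PROD_ID": 1001, "PROD_CD": f"style-{n}"} for n in range(15)]

index, added = index_documents(rows, "PROD_ID")
print(f"Added/Updated: {added} documents, visible for *:*: {len(index)}")
```

A quick check along these lines is to run the SQL directly and count distinct PROD_ID values against the total row count.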
Re: Howto? Obtain the IndexReader from within a solr filter
On May 18, 2009, at 10:16 AM, A. Banji Oyebisi wrote:
> Does anyone know if unsubscribe works for this mailing list? I don't seem to be able to unsubscribe.

It should work for you to unsubscribe by sending a mail to solr-user-unsubscr...@lucene.apache.org from your subscribed address, and you will then need to reply to the confirmation it sends. If that doesn't work, e-mail me with the address to be unsubscribed and I can do it.

Erik
RE: Incorrect sort with with function query in query parameters
> A unit test would be ideal, but even if you can just provide a list of steps (i.e. using this solrconfig+schema, index these docs, then update this one doc, then execute this search) it can help people track things down. Please open a bug and attach as much detail as you can there.
>
> -Hoss

Was a bug ever opened on this? I am seeing similar behavior (though in my case it's the debug scores that look wrong).

-Ken
Re: Solr Shard - Strange results
I'm not quite sure how that would make a difference... From my most recent testing, it seems that the problem is related to the shards element adding ids=[...] to one of the queries. However, I will give it a try.

Yao Ge wrote:
> Maybe you want to try with docNumber field type as string and see if it would make a difference.
Re: multicore for 20k users?
Since there is so little overlap, I would look at a core for each user...

However, to manage 20K cores, you will not want to use the off-the-shelf core management implementation to maintain these cores. Consider overriding SolrDispatchFilter to initialize a CoreContainer that you manage.

On May 17, 2009, at 10:11 PM, Chris Cornell wrote:
> On Sun, May 17, 2009 at 8:38 PM, Otis Gospodnetic otis_gospodne...@yahoo.com wrote:
>> Chris,
>>
>> Yes, disk space is cheap, and with so little overlap you won't gain much by putting everything in a single index. Plus, when each user has a separate index, it's easy to split users and distribute them over multiple machines if you ever need to do that, it's easy and fast to completely reindex one user's data without affecting other users, etc.
>>
>> Several years ago I built Simpy at http://www.simpy.com/ that way (but pre-Solr, so it uses Lucene directly) and never regretted it. There are way more than 20K users there with many searches per second and with constant indexing. Each user has an index for bookmarks and an index for notes. Each group has its own index, shared by all group members. The main bookmark search is another index. People search is yet another index. And so on. Single server.
>
> Thank you very much for your insight and experience; sounds like we shouldn't be thinking about prematurely optimizing this.
>
> Has someone actually used multicore this way, though? With thousands of them?
>
> Independently of advice in that regard, I guess our next step is to explore and create some dummy scenarios/tests to try and stress multicore (search latency is not as much of a factor as memory usage is). I'll report back on any conclusion we come to.
>
> Thanks!
> Chris
Re: UK Solr users meeting?
+1 vote here. We are based in London.

Regards
Waseem

On Mon, May 18, 2009 at 11:42 AM, Toby Cole toby.c...@semantico.com wrote:
> I know of a few people who'd be interested, we've got quite a few projects using Solr down here in Brighton.
>
> On 14 May 2009, at 10:41, Fergus McMenemie wrote:
>>> I was wondering if there is an interest in a UK (South East) solr user group meeting. Please let me know if you are interested. I am happy to organize.
>>>
>>> Regards,
>>> Colin
>>
>> Yes, very interested. I am in Lincolnshire.
>> --
>> ===
>> Fergus McMenemie      Email: fer...@twig.me.uk
>> Techmore Ltd          Phone: (UK) 07721 376021
>> Unix/Mac/Intranets    Analyst Programmer
>> ===
>
> Toby Cole
> Software Engineer
> Semantico
> Lees House, Floor 1, 21-23 Dyke Road, Brighton BN1 3FE
> T: +44 (0)1273 358 238
> F: +44 (0)1273 723 232
> E: toby.c...@semantico.com
> W: www.semantico.com
Re: query regarding Indexing xml files -db-data-config.xml
Hi Noble,

Thanks for the reply. As advised, I have changed the db-data-config.xml as below, but I still get:

    <str name="">Indexing completed. Added/Updated: 0 documents. Deleted 0 documents.</str>

    <dataConfig>
      <dataSource type="FileDataSource" name="xmlindex"/>
      <document name="products">
        <entity name="xmlfile" processor="FileListEntityProcessor" fileName="c:\\test\\ipod_other.xml" recursive="true" rootEntity="false" dataSource="null" baseDir="${dataimporter.request.xmlDataDir}" useSolrAddSchema="true">
          <entity name="data" processor="XPathEntityProcessor" url="${xmlfile.fileAbsolutePath}">
            <field column="manu" name="manu"/>
          </entity>
        </entity>
      </document>
    </dataConfig>

I got the error below when baseDir is removed:

    INFO: last commit = 1242683454570
    May 18, 2009 2:55:15 PM org.apache.solr.handler.dataimport.DataImporter doFullImport
    SEVERE: Full Import failed
    org.apache.solr.handler.dataimport.DataImportHandlerException: 'baseDir' is a required attribute Processing Document # 1
        at org.apache.solr.handler.dataimport.FileListEntityProcessor.init(FileListEntityProcessor.java:76)
        at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:299)
        at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:225)
        at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:167)
        at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:324)
        at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:382)
        at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:363)
    May 18, 2009 2:55:15 PM org.apache.solr.update.DirectUpdateHandler2 rollback
    INFO: start rollback

Please advise.

Thanks and regards,
Jay

2009/5/17 Noble Paul നോബിള് नोब्ळ् noble.p...@corp.aol.com
> Hi,
>
> You may not need that enclosing entity if you only wish to index one file.
>
> baseDir is not required if you give an absolute path in the fileName.
>
> No need to mention forEach or fields if you set useSolrAddSchema="true".
>
> On Sat, May 16, 2009 at 1:23 AM, jayakeerthi s mail2keer...@gmail.com wrote:
>> Hi All,
>>
>> I am trying to index the fields from the xml files; here is the configuration that I am using.
>>
>> db-data-config.xml
>>
>> <dataConfig>
>>   <dataSource type="FileDataSource" name="xmlindex"/>
>>   <document name="products">
>>     <entity name="xmlfile" processor="FileListEntityProcessor" fileName="c:\test\ipod_other.xml" recursive="true" rootEntity="false" dataSource="null" baseDir="${dataimporter.request.xmlDataDir}">
>>       <entity name="data" processor="XPathEntityProcessor" forEach="/record | /the/record/xpath" url="${xmlfile.fileAbsolutePath}">
>>         <field column="manu" name="manu"/>
>>       </entity>
>>     </entity>
>>   </document>
>> </dataConfig>
>>
>> Schema.xml has the field manu.
>>
>> The input xml file used to import the field is:
>>
>> <doc>
>>   <field name="id">F8V7067-APL-KIT</field>
>>   <field name="name">Belkin Mobile Power Cord for iPod w/ Dock</field>
>>   <field name="manu">Belkin</field>
>>   <field name="cat">electronics</field>
>>   <field name="cat">connector</field>
>>   <field name="features">car power adapter, white</field>
>>   <field name="weight">4</field>
>>   <field name="price">19.95</field>
>>   <field name="popularity">1</field>
>>   <field name="inStock">false</field>
>> </doc>
>>
>> Doing the full-import, this is the response I am getting:
>>
>> <lst name="statusMessages">
>>   <str name="Total Requests made to DataSource">0</str>
>>   <str name="Total Rows Fetched">0</str>
>>   <str name="Total Documents Skipped">0</str>
>>   <str name="Full Dump Started">2009-05-15 11:58:00</str>
>>   <str name="">Indexing completed. Added/Updated: 0 documents. Deleted 0 documents.</str>
>>   <str name="Committed">2009-05-15 11:58:00</str>
>>   <str name="Optimized">2009-05-15 11:58:00</str>
>>   <str name="Time taken">0:0:0.172</str>
>> </lst>
>> <str name="WARNING">This response format is experimental. It is likely to change in the future.</str>
>> </response>
>>
>> Am I missing anything here, or is there a required format for the input xml? Please help resolve this.
>>
>> Thanks and regards,
>> Jay
>
> --
> - Noble Paul | Principal Engineer| AOL | http://aol.com
Re: Incorrect sort with with function query in query parameters
I have been intending to, although I have been dragging my feet on it. I've never opened a bug before, so I'm not sure of the protocol. If you don't mind, it would be great if you could send me a PM and point me in the right direction.

Thanks,
Asif

On Mon, May 18, 2009 at 7:30 PM, Ensdorf Ken ensd...@zoominfo.com wrote:
>> A unit test would be ideal, but even if you can just provide a list of steps (i.e. using this solrconfig+schema, index these docs, then update this one doc, then execute this search) it can help people track things down. Please open a bug and attach as much detail as you can there.
>>
>> -Hoss
>
> Was a bug ever opened on this? I am seeing similar behavior (though in my case it's the debug scores that look wrong).
>
> -Ken
Re: Howto? Applying a filter across schema fileds using state information
I needed to do something like this recently as well. I needed to copy a date field (with full precision to the millisecond) to a string field of just MMDD. I didn't see a way to do it in Solr core. I ended up doing it in the Data Import Handler during import. I'd rather have code like that in the core someplace, in case documents are added via some other mechanism.

-Bryan

On May 18, 2009, at May 18, 1:44 AM, Yatir wrote:
> Hi,
>
> I need to write a filter that extracts information from the content of one field (say, the Body field) and then applies some transformation, based on this content, to a *different* field (say, the Title field). Is this possible?
>
> Example: I will find certain keywords in the body and then locate them and transform them in the title.
>
> --
> View this message in context: http://www.nabble.com/Howto--Applying-a-filter-across-schema-fileds-using-state-information-tp23593424p23593424.html
> Sent from the Solr - User mailing list archive at Nabble.com.
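
A rough sketch of the approach Bryan describes, done on the client side before the document is sent to Solr rather than inside an analysis filter: derive one field's value from another field while building the document. The field names (published_at, published_day), the input timestamp layout, and the year-month-day output pattern are all assumptions for the example; the post does not give them.

```python
from datetime import datetime


def day_string(timestamp):
    """Reduce a millisecond-precision UTC timestamp to a compact day string."""
    parsed = datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.strftime("%Y%m%d")


def add_derived_field(doc, source="published_at", target="published_day"):
    """Return a copy of doc with target computed from source (input is untouched)."""
    enriched = dict(doc)
    enriched[target] = day_string(doc[source])
    return enriched


doc = {"id": "42", "published_at": "2009-05-18T01:44:00.123Z"}
enriched = add_derived_field(doc)
print(enriched["published_day"])
```

The same pattern covers the Body-to-Title case from the question: compute whatever is needed from one field and write the result into another before indexing, since an analysis filter only sees the token stream of a single field.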
Re: Query Boost Functions
On Mon, May 18, 2009 at 11:12 AM, Koji Sekiguchi k...@r.email.ne.jp wrote:
> Yao Ge wrote:
>> I have a field named last-modified that I would like to use in the bf (Boost Functions) parameter of DisMaxRequestHandler:
>>
>> recip(rord(last-modified),1,1000,1000)
>>
>> However, the Solr query parser complains about the syntax of the formula. I think it is related to the hyphen in the field name. I have tried adding single and double quotes around the field name, but that didn't help.
>>
>> Can a field name contain a hyphen in boost functions? How do I do it? If not, where do I find the field name special character restrictions?
>>
>> -Yao
>
> Hmm, this seems to be a bug. Can you open a JIRA issue?
> Meanwhile, you can use . or _ instead of -.

I regret not being more strict on fieldnames earlier on... I think best practice should be to limit Solr fieldnames to valid Java identifiers... you're going to be much more future-proof that way (think about a future infix function query parser, nice client mappings for field names, etc).

-Yonik
http://www.lucidimagination.com
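
Yonik's rule of thumb can be checked mechanically before a schema is deployed. The sketch below tests names against an ASCII-only approximation of a Java identifier (Java also allows Unicode letters, and reserved words are not checked here); the function names are made up for the example.

```python
import re

# ASCII approximation: a letter, underscore or dollar sign, then letters/digits/_/$.
JAVA_IDENTIFIER = re.compile(r"[A-Za-z_$][A-Za-z0-9_$]*")


def is_safe_field_name(name):
    """True if the whole name looks like a (plain ASCII) Java identifier."""
    return JAVA_IDENTIFIER.fullmatch(name) is not None


def suggest_field_name(name):
    """Replace every disallowed character with '_' and guard a leading digit."""
    cleaned = re.sub(r"[^A-Za-z0-9_$]", "_", name)
    if cleaned and cleaned[0].isdigit():
        cleaned = "_" + cleaned
    return cleaned


for field in ("last-modified", "last_modified", "2nd.price"):
    print(field, is_safe_field_name(field), suggest_field_name(field))
```

With this check, last-modified from the question is flagged and last_modified is suggested, which matches the workaround Koji gives.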
Re: Incorrect sort with with function query in query parameters
This was fixed April 24th:
https://issues.apache.org/jira/browse/SOLR-1124
explained further in
https://issues.apache.org/jira/browse/SOLR-

-Yonik
http://www.lucidimagination.com

On Thu, Mar 26, 2009 at 10:24 PM, Asif Rahman a...@newscred.com wrote:
> Hi all,
>
> I'm having an issue with the order of my results when attempting to sort by a function in my query. Looking at the debug output of the query, the score returned in the result section for any given document does not match the score in the debug output. It turns out that if I optimize the index, then the results are sorted correctly. The scores in the debug output are the correct scores. This behavior only occurs using a recent nightly build of Solr. It works correctly in Solr 1.3.
>
> An example query is:
>
> http://localhost:8080/solr/core-01/select?qt=standard&fl=*,score&rows=10&q=*:*%20_val_:recip(rord(article_published_at),1,1000,1000)^1&debugQuery=on
>
> I've attached the result to this email.
>
> Can anybody shed any light on this problem?
>
> Thanks,
>
> Asif
>
> http://www.nabble.com/file/p22735009/result.xml result.xml
> --
> View this message in context: http://www.nabble.com/Incorrect-sort-with-with-function-query-in-query-parameters-tp22735009p22735009.html
> Sent from the Solr - User mailing list archive at Nabble.com.
DataImportHandler Template Transformer
It took me a while to understand that, to use the TemplateTransformer (http://lucene.apache.org/solr/api/org/apache/solr/handler/dataimport/TemplateTransformer.html), none of the variables used to build the template (e.g. ${e.firstName}, ${e.lastName}, etc.) may contain null values. I hope the parser can do a better job of explaining this. It would also be nice to simply pad null values with a blank string. Should this be considered as an enhancement?

--
View this message in context: http://www.nabble.com/DataImportHandler-Template-Transformer-tp23609267p23609267.html
Sent from the Solr - User mailing list archive at Nabble.com.
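
To make the two behaviours concrete, here is a small stand-alone imitation of template substitution. It is not the TemplateTransformer source: apply_template and its null_as_blank switch are invented names, and the "skip the template when any variable is null" branch only mirrors what the post above reports.

```python
import re

PLACEHOLDER = re.compile(r"\$\{([^}]+)\}")


def apply_template(template, variables, null_as_blank=False):
    """Fill ${name} placeholders; by default give up if any value is missing."""
    names = PLACEHOLDER.findall(template)
    has_null = any(variables.get(name) is None for name in names)
    if has_null and not null_as_blank:
        return None  # reported behaviour: the template is not applied at all

    def fill(match):
        value = variables.get(match.group(1))
        return "" if value is None else str(value)  # proposed: pad null with ""

    return PLACEHOLDER.sub(fill, template)


row = {"e.firstName": "Ada", "e.lastName": None}
print(apply_template("${e.firstName} ${e.lastName}", row))
print(repr(apply_template("${e.firstName} ${e.lastName}", row, null_as_blank=True)))
```

Until something like the second mode exists, one option is to coalesce nulls to empty strings in the SQL query itself, so the template never sees a null.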
Re: Sole core naming convention for multicores
I think I should also do it the same way. Thanks, Brian, for pointing me to this idea. As per my db design I don't have any single key as PK; I'm thinking of adding a new field called, say, coreId and making it the PK with the auto-increment option [I'm using MySQL, btw], and this will solve the problem. I think this is a simple yet elegant way to fix the problem.

Thanks,
KK

On Mon, May 18, 2009 at 8:14 PM, Brian Mansell lifeofbr...@gmail.com wrote:
> KK -
>
> In my experience with multi-core, I've found that using the user record's integer PK for each user core works well by still allowing the user to update their email addresses / usernames over time.
>
> cheers,
> --bemansell
>
> On May 17, 2009 10:39 PM, KK dioxide.softw...@gmail.com wrote:
>> Thank you Otis. One silly question: how would I know that a particular character is forbidden? I think Solr will give me exceptions saying that some characters are not allowed, right?
>>
>> Thank,
>> KK.
>>
>> On Sun, May 17, 2009 at 3:12 AM, Otis Gospodnetic otis_gospodne...@yahoo.com wrote:
>>> KK, ...
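
The two naming strategies discussed in this thread can be compared in a few lines. The allowed character set below (ASCII letters, digits, underscore, hyphen) is a conservative assumption, not a rule taken from Solr, and both function names are hypothetical.

```python
import re


def core_name_from_email(email):
    """Otis's suggestion: consistently replace risky characters with '_'."""
    return re.sub(r"[^A-Za-z0-9_-]", "_", email)


def core_name_from_user_id(user_id):
    """Brian's suggestion: derive the core name from the integer primary key."""
    return f"core{user_id}"


print(core_name_from_email("alex@abc.com"))
print(core_name_from_user_id(42))

# Replacement is lossy: two different addresses can map to the same core name.
print(core_name_from_email("a.b@x.org") == core_name_from_email("a_b@x.org"))
```

The last line shows the weakness of character replacement: distinct addresses can collide, whereas the primary-key scheme stays unique and survives a user changing their email address.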
Re: Howto? Obtain the IndexReader from within a solr filter
Thanks for Hijacking my thread!

lerosky wrote:
> Does anyone know if unsubscribe works for this mailing list? I don't seem to be able to unsubscribe.
>
> On May 18, 2009, at 9:12 AM, Noble Paul നോബിള് नोब्ळ् noble.p...@corp.aol.com wrote:
>>> I am writing a solr filter
>>
>> what is a solr filter?
>>
>> On Mon, May 18, 2009 at 2:18 PM, Yatir yat...@outbrain.com wrote:
>>> Hi,
>>>
>>> I am writing a solr filter that needs the DocFreq of each Token in order to decide what to do with it. What is the easiest way to obtain this information from within the filter code?
>>>
>>> thanks,
>>> Yatir
>>> --
>>> View this message in context: http://www.nabble.com/Howto--Obtain-the-IndexReader-from-within-a-solr-filter-tp23593475p23593475.html
>>> Sent from the Solr - User mailing list archive at Nabble.com.
>>
>> --
>> - Noble Paul | Principal Engineer| AOL | http://aol.com

--
View this message in context: http://www.nabble.com/Howto--Obtain-the-IndexReader-from-within-a-solr-filter-tp23593475p23610189.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Howto? Obtain the IndexReader from within a solr filter
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters

Noble Paul നോബിള് नोब्ळ्-2 wrote:
>> I am writing a solr filter
>
> what is a solr filter?
>
> On Mon, May 18, 2009 at 2:18 PM, Yatir yat...@outbrain.com wrote:
>> Hi,
>>
>> I am writing a solr filter that needs the DocFreq of each Token in order to decide what to do with it. What is the easiest way to obtain this information from within the filter code?
>>
>> thanks,
>> Yatir
>> --
>> View this message in context: http://www.nabble.com/Howto--Obtain-the-IndexReader-from-within-a-solr-filter-tp23593475p23593475.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>
> --
> - Noble Paul | Principal Engineer| AOL | http://aol.com

--
View this message in context: http://www.nabble.com/Howto--Obtain-the-IndexReader-from-within-a-solr-filter-tp23593475p23610196.html
Sent from the Solr - User mailing list archive at Nabble.com.