Re: solr plugins
: I updated with a patch. Is it possible to get this in soon cuz I : have a client waiting on this. I've posted some comments about your patch. At the moment, the committers have started focusing on getting 1.2 released. Even if this were a really popular issue, it's a non-trivial change that we probably would not want to rush in before the release. -Hoss
facet should add facet.analyzer
If facet.analyzer is true, analyze the facet field values; if false, don't. Why do I say that? Chinese words are not separated by spaces, so if the field is analyzed, the facet values change. For now, without facet.analyzer, I use a map to fix it. -- regards jl
Re: solr plugins
Hi Yonik: I updated with a patch. Is it possible to get this in soon cuz I have a client waiting on this. Thanks again -John On 5/22/07, John Wang <[EMAIL PROTECTED]> wrote: Hi Yonik: Thank you again for your help! I created an improvement item in jira (SOLR-243) on this. -John On 5/19/07, Yonik Seeley <[EMAIL PROTECTED]> wrote: > On 5/19/07, John Wang < [EMAIL PROTECTED]> wrote: > > Hi Yonik: > > > > Thanks for the info! > > > > This solves my problem, but not elegantly. > > > > I have a custom implementation where I derived from the > > IndexReader class to store some custom data. Now I am trying to write > > a Solr plugin for my search implementation but I want to be able to > > use my IndexReader implementation. > > > > Is there a way to override the IndexReader instantiation? e.g. > > IndexReader newReader() etc. > > Not currently, but it might be a useful feature. > > -Yonik >
Re: AW: Re[2]: add and delete docs at same time
On 25-May-07, at 2:49 AM, Burkamp, Christian wrote: Thierry, If you always start from scratch you could even reset the index completely (i.e. delete the index directory). Solr will create a new index automatically at startup. This will also make indexing and optimizing much faster for any non-trivial size index. -Mike
RE: field display values
This would require some storage when the index is built to map between the internal field name and the "display name" ... since this is not a Lucene concept it would have to be a higher level concept that Solr writes to disk directly -- there are currently no concepts like this but that doesn't mean there can't be. The question becomes: "Is this the type of data that Solr *should* store?" ... in my opinion the answer is no. I can't think of any value add in having Solr keep track of the fact that "ds" means "Download Speed" vs having an external data mapping keep track of that information; since direct access to that info inside of Solr wouldn't typically make the performance of requests any faster or reduce the size of the responses, it seems like the type of data that makes more sense to maintain externally. As to your specific situation... : I would normally agree but the problem is that I'm making very heavy use : of the dynamic fields and therefore don't really know what a record : looks like. Ie the only thing that knows about the data is the input : data itself. I've added logic to 'solrify' the input field names as : they come to me in the "Download Speed" format but making the reverse : happen is impossible from the client side because each record is : different. ...if every document is truly different, then the "ds" field for one doc may not be the same as the "ds" field for another doc ... which makes it sound like the field display names themselves are document-specific 'data' that should be stored as field values. I have a lot of personal experience with an app (the first Solr app actually) where the dynamic fields a doc has depend on its category, and I actually put the info about the fields (including their display names and info on how to facet on them) into stored fields of special "metadata documents" which go into the index; a custom request handler first asks "what category am I interested in?" to find the relevant metadata doc, and then uses the info found in that doc to both query the index for the "real" results, as well as to return the "display" values for all of the important fields. If you can partition your index in this way, then similar metadata docs might make sense for you ... if you can't (because every doc truly is different) then making the "real" documents also store the "metadata" about field names can work just as well. -Hoss
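The metadata-document pattern Hoss describes could be sketched as a plain add command (the field names here are hypothetical examples, not an actual schema):

```xml
<add>
  <doc>
    <!-- a "metadata document", indexed alongside the real documents -->
    <field name="type">metadata</field>
    <field name="category">broadband</field>
    <!-- maps the internal dynamic field name to its display label -->
    <field name="fieldName">ds</field>
    <field name="displayName">Download Speed (MB/sec)</field>
  </doc>
</add>
```

A custom request handler would first fetch the metadata doc for the category in play, then run the real query and use the stored display names when rendering results.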
Re: field display values
I had a similar issue with a heavy use of dynamic fields. You first want to get those spaces out of there. Lucene does not like spaces in field names. So, I just replaced the space with a rarely used character (ASCII 8 or something like that). I did this in my indexing. And then I just translate between the Lucene encoded field name (without spaces) and my display field name (with spaces) when I go back and forth between Solr and my client. So, your "Download Speed"=>"DownloadSpeed"=>"Download Speed". Spaces are the only characters that seemed to cause problems. It seems to work just fine. - Original Message From: Will Johnson <[EMAIL PROTECTED]> To: solr-user@lucene.apache.org Sent: Friday, May 25, 2007 1:48:22 PM Subject: RE: field display values I would normally agree but the problem is that I'm making very heavy use of the dynamic fields and therefore don't really know what a record looks like. Ie the only thing that knows about the data is the input data itself. I've added logic to 'solrify' the input field names as they come to me in the "Download Speed" format but making the reverse happen is impossible from the client side because each record is different. - will -Original Message- From: Ryan McKinley [mailto:[EMAIL PROTECTED] Sent: Friday, May 25, 2007 4:32 PM To: solr-user@lucene.apache.org Subject: Re: field display values Will Johnson wrote: > Has anyone done anything interesting to preserve display values for > field names. Ie my users would like to see > > Download Speed (MB/sec): 5 > > As opposed to: > > ds:5 > > The general model has been to think of solr like SQL... it is only the database - display choices should be at the client side. 
It seems easy enough to have a map on the client with: "ds" => "Download Speed (MB/sec)" That said, something like the sql 'as' command would be useful: SELECT ds as `Download Speed (MB/sec)` FROM table...; rather than define the field name at index time (as your example suggests) it makes more sense to define it at query time (or as a default in the RequestHandler config) Maybe something like: /select?fl=ds&display.ds=Download Speed (MB/sec) Maybe this would be a way to specify date formatting? /select?fl=timestamp&display.timestamp='Year'&display.format.timestamp=YYYY just thoughts...
RE: field display values
I would normally agree but the problem is that I'm making very heavy use of the dynamic fields and therefore don't really know what a record looks like. Ie the only thing that knows about the data is the input data itself. I've added logic to 'solrify' the input field names as they come to me in the "Download Speed" format but making the reverse happen is impossible from the client side because each record is different. - will -Original Message- From: Ryan McKinley [mailto:[EMAIL PROTECTED] Sent: Friday, May 25, 2007 4:32 PM To: solr-user@lucene.apache.org Subject: Re: field display values Will Johnson wrote: > Has anyone done anything interesting to preserve display values for > field names. Ie my users would like to see > > Download Speed (MB/sec): 5 > > As opposed to: > > ds:5 > > The general model has been to think of solr like SQL... it is only the database - display choices should be at the client side. It seems easy enough to have a map on the client with: "ds" => "Download Speed (MB/sec)" That said, something like the sql 'as' command would be useful: SELECT ds as `Download Speed (MB/sec)` FROM table...; rather than define the field name at index time (as your example suggests) it makes more sense to define it at query time (or as a default in the RequestHandler config) Maybe something like: /select?fl=ds&display.ds=Download Speed (MB/sec) Maybe this would be a way to specify date formatting? /select?fl=timestamp&display.timestamp='Year'&display.format.timestamp=YYYY just thoughts...
Re: field display values
Will Johnson wrote: Has anyone done anything interesting to preserve display values for field names. Ie my users would like to see Download Speed (MB/sec): 5 As opposed to: ds:5 The general model has been to think of solr like SQL... it is only the database - display choices should be at the client side. It seems easy enough to have a map on the client with: "ds" => "Download Speed (MB/sec)" That said, something like the sql 'as' command would be useful: SELECT ds as `Download Speed (MB/sec)` FROM table...; rather than define the field name at index time (as your example suggests) it makes more sense to define it at query time (or as a default in the RequestHandler config) Maybe something like: /select?fl=ds&display.ds=Download Speed (MB/sec) Maybe this would be a way to specify date formatting? /select?fl=timestamp&display.timestamp='Year'&display.format.timestamp=YYYY just thoughts...
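The client-side map Ryan suggests could be sketched like this (the field names and labels are hypothetical examples, not a fixed schema):

```python
# Hypothetical map from short dynamic-field names to display labels.
DISPLAY_NAMES = {
    "ds": "Download Speed (MB/sec)",
    "us": "Upload Speed (MB/sec)",  # assumed second dynamic field
}

def display(field, value):
    """Render a returned field with its display label, falling back to the raw name."""
    label = DISPLAY_NAMES.get(field, field)
    return "%s: %s" % (label, value)
```

With this in place a returned doc field ds=5 renders as "Download Speed (MB/sec): 5" without any change on the Solr side.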
field display values
Has anyone done anything interesting to preserve display values for field names. Ie my users would like to see Download Speed (MB/sec): 5 As opposed to: ds:5 There are options for doing fancy encoding of field names but those seem less than ideal. What I'd really like to do is at add time: <field name="Download Speed (MB/sec)">hi</field> And then at result time: <str name="Download Speed (MB/sec)">hi</str> I've thought of having custom request handlers save this info away and then add it back in with a custom response writer but this seemed like it might be a more generally useful type of thing to have. Thoughts, ideas? - will
Re: Problem with machine hostname and Solr/Tomcat
: Anyone encounter a problem when changing their hostname? (via : /etc/conf.d/hostname or just the hostname command) I'm getting this error : when going to the admin screen, I have a feeling it's a simple fix. It : seems to work when it thinks the machine's name is just 'localhost'. I don't think this is a tomcat or solr issue ... it looks like a basic java/dns issue (that can most likely be reproduced with a 4 line commandline java app for testing). Take a look at the InetAddress javadocs, specifically the info on Caching. My guess is either: 1) your reverse name lookup doesn't match the name you are using, which causes the getLocalHost call to freak out because it can't do a DNS lookup on the hostname it thinks it is. 2) you changed the name while the JVM was running, and the "forever" cache is returning a name that no longer exists. -Hoss
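Hoss's point that this is reproducible outside Tomcat can be checked in a few lines; here is a sketch of the same forward lookup in Python (the Java equivalent is InetAddress.getLocalHost().getCanonicalHostName()):

```python
import socket

def resolves(name):
    """Return True if `name` can be resolved, mirroring the forward lookup
    the JVM performs in InetAddress.getLocalHost()."""
    try:
        socket.gethostbyname(name)
        return True
    except socket.gaierror:
        return False

# If this prints False for your machine, add the hostname to /etc/hosts or DNS.
print(resolves(socket.gethostname()))
```

When the hostname set via /etc/conf.d/hostname has no matching /etc/hosts or DNS entry, this check fails in the same way the admin JSP does.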
Re: read only indexes?
We're controlling this with Tomcat configuration on our end. I'm not a servlet-container guru, but I would imagine similar capabilities exist on Jetty, et al. -- j On 5/24/07, Ryan McKinley <[EMAIL PROTECTED]> wrote: Is there a good way to force an index to be read-only? I could configure a dummy handler to sit on top of /update and throw an error, but i'd like a stronger assurance that nothing can call UpdateHandler.addDoc()
Re: Difficulty posting unicode to solr index
On 5/25/07, Ethan Gruber <[EMAIL PROTECTED]> wrote: Posting utf8-example.xml is the first thing I tried when I ran into this problem, and like the other files I had been working with, query results return garbage characters inside of unicode. After posting utf8-example.xml, try this query: http://localhost:8983/solr/select?indent=on&q=id%3AUTF8TEST&fl=features&wt=python The python writer uses unicode escapes to keep the output in the ascii range, so it's an easy way to see exactly what Solr thinks those characters are. You should get { 'responseHeader':{ 'status':0, 'QTime':0, 'params':{ 'wt':'python', 'indent':'on', 'q':'id:UTF8TEST', 'fl':'features'}}, 'response':{'numFound':1,'start':0,'docs':[ { 'features':[ 'No accents here', u'This is an e acute: \u00e9', u'eaiou with circumflexes: \u00ea\u00e2\u00ee\u00f4\u00fb', u'eaiou with umlauts: \u00eb\u00e4\u00ef\u00f6\u00fc', 'tag with escaped chars: ', 'escaped ampersand: Bonnie & Clyde']}] }} If you do, that means that the problem is not getting the data into solr, but the interpretation of what you get out. -Yonik
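Building Yonik's debugging URL programmatically might look like this (a sketch; the base URL assumes the example Jetty setup from the tutorial):

```python
from urllib.parse import urlencode

def debug_url(q, fl, base="http://localhost:8983/solr/select"):
    """Build a query URL using wt=python, so any non-ASCII characters
    come back as unambiguous unicode escapes."""
    return base + "?" + urlencode({"indent": "on", "q": q, "fl": fl, "wt": "python"})
```

For the UTF8TEST example: debug_url("id:UTF8TEST", "features").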
Re: read only indexes?
Didn't somebody talk about providing Solr with a custom (subclass of) IndexReader here on the list the other day? Perhaps then a ReadOnlyIndexWriter with appropriately overridden delete methods might be one approach to this. Or chmod -w? ;) Otis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Simpy -- http://www.simpy.com/ - Tag - Search - Share - Original Message From: Ryan McKinley <[EMAIL PROTECTED]> To: solr-user@lucene.apache.org Sent: Thursday, May 24, 2007 2:35:44 PM Subject: read only indexes? Is there a good way to force an index to be read-only? I could configure a dummy handler to sit on top of /update and throw an error, but i'd like a stronger assurance that nothing can call UpdateHandler.addDoc()
Re: Difficulty posting unicode to solr index
Posting utf8-example.xml is the first thing I tried when I ran into this problem, and like the other files I had been working with, query results return garbage characters inside of unicode. On 5/25/07, Yonik Seeley <[EMAIL PROTECTED]> wrote: On 5/25/07, Ethan Gruber <[EMAIL PROTECTED]> wrote: > Yes, it's definitely encoded in UTF-8. I'm going to attempt either today or > Tuesday to post the files to a solr index that is online (as opposed to > localhost as was my case a few days ago) using post.sh through SSH and let > you know how it turns out. That should definitely indicate whether or not > the problem is with my files themselves or the post.jar file. Why don't you try a file that we know is encoded in UTF-8, the solr/example/exampledocs/utf8-example.xml Try it first without modifying it (an editor can change the encoding a file is stored in). -Yonik
Re: Difficulty posting unicode to solr index
On 5/25/07, Ethan Gruber <[EMAIL PROTECTED]> wrote: Yes, it's definitely encoded in UTF-8. I'm going to attempt either today or Tuesday to post the files to a solr index that is online (as opposed to localhost as was my case a few days ago) using post.sh through SSH and let you know how it turns out. That should definitely indicate whether or not the problem is with my files themselves or the post.jar file. Why don't you try a file that we know is encoded in UTF-8, the solr/example/exampledocs/utf8-example.xml Try it first without modifying it (an editor can change the encoding a file is stored in). -Yonik
Re: AW: Re[2]: add and delete docs at same time
Just to be clear, [* TO *] does not necessarily return all documents. It returns all documents that have a value in the specified (or default) field. Be careful with that! *:*, however, does match all documents. Erik On May 25, 2007, at 5:49 AM, Burkamp, Christian wrote: Thierry, If you always start from scratch you could even reset the index completely (i.e. delete the index directory). Solr will create a new index automatically at startup. If you don't like to delete the files another approach would be to use a query that returns all documents. You do not need a dummy field for this. The range query [* TO *] returns all documents. (In newer versions of solr you can use *:*, which executes a bit faster.) -- Christian -Ursprüngliche Nachricht- Von: Thierry Collogne [mailto:[EMAIL PROTECTED] Gesendet: Freitag, 25. Mai 2007 10:30 An: solr-user@lucene.apache.org; Jack L Betreff: Re: Re[2]: add and delete docs at same time We always do a full delete before indexing, this is because for us that is the only way to be sure that there are no documents in the index that don't exist anymore. So delete all, then add all. To use the delete all, we did the following. We added a field called dummyDelete. This field always contains the value delete. Like this delete Then to delete all documents we do a request containing: dummyDelete:delete That way all documents are deleted where the field dummyDelete contains delete => all the documents Hope this is clear. I am not sure if this is a good solution, but it does work. :) Greet, Thierry On 25/05/07, Jack L <[EMAIL PROTECTED]> wrote: Oh, is that the case? One document per request for delete? I'm about to implement delete. Just want to confirm. -- Best regards, Jack Thursday, May 24, 2007, 12:47:21 PM, you wrote: currently no. Right now you even need a new request for each delete... Patrick Givisiez wrote: can I add and delete docs at same post? Some thing like this: myDocs.xml = 4 5 6 1 2 3 = Thanks!
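For reference, the two delete-by-query variants discussed above look like this on the wire (POSTed to /update with Content-Type text/xml):

```xml
<!-- matches only documents that have a value in the specified/default field -->
<delete><query>[* TO *]</query></delete>

<!-- matches every document, regardless of field values (newer Solr versions) -->
<delete><query>*:*</query></delete>
```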
RE: index problem with write lock
I think I had the same problem (the same error at least) and submitted a patch. The patch adds a new config option to use the nio locking facilities instead of the default lucene locking. In the ~week since I haven't seen the issue after applying the patch (ymmv) https://issues.apache.org/jira/browse/SOLR-240 - will -Original Message- From: Chris Hostetter [mailto:[EMAIL PROTECTED] Sent: Friday, May 25, 2007 1:50 AM To: solr-user@lucene.apache.org Subject: Re: index problem with write lock : i know how to fix it. : : but i just don't know why it happen. : : this solr error information: : : > Exception during commit/optimize:java.io.IOException: Lock obtain timed : > out: SimpleFSLock@/usr/solrapp/solr21/data/index/write.lock that's the problem you see ... but in normal SOlr operation there's no reason why there should be any problem getting the write lock -- Solr only ever makes one IndexWriter at a time. which is why i asked about any other errors earlier in your log (possibly much earlier) to indicate *abnormal* Solr operation. -Hoss
Re: Doubt in using synonyms.txt
Thanks Yonik. Regards, Doss. On 5/25/07, Yonik Seeley <[EMAIL PROTECTED]> wrote: On 5/24/07, Doss <[EMAIL PROTECTED]> wrote: > Is it advisable to maintain a large amount of data in synonyms.txt file? It's read into an in-memory map, so the only real impact is increased RAM usage. There really shouldn't be a performance impact. -Yonik
Problem with machine hostname and Solr/Tomcat
Anyone encounter a problem when changing their hostname? (via /etc/conf.d/hostname or just the hostname command) I'm getting this error when going to the admin screen, I have a feeling it's a simple fix. It seems to work when it thinks the machine's name is just 'localhost'. org.apache.jasper.JasperException: Exception in JSP: /admin/_info.jsp:43 40: } 41: 42: String collectionName = schema!=null ? schema.getName():"unknown"; 43: InetAddress addr = InetAddress.getLocalHost(); 44: String hostname = addr.getCanonicalHostName(); 45: 46: String defaultSearch = ""; Stacktrace: org.apache.jasper.servlet.JspServletWrapper.handleJspException(JspServletWrapper.java:467) org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:377) org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:315) org.apache.jasper.servlet.JspServlet.service(JspServlet.java:265) javax.servlet.http.HttpServlet.service(HttpServlet.java:803) org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:133) root cause java.net.UnknownHostException: app10: app10 java.net.InetAddress.getLocalHost(InetAddress.java:1308) org.apache.jsp.admin.index_jsp._jspService(index_jsp.java:95) org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:98) javax.servlet.http.HttpServlet.service(HttpServlet.java:803) org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:328) org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:315) org.apache.jasper.servlet.JspServlet.service(JspServlet.java:265) javax.servlet.http.HttpServlet.service(HttpServlet.java:803) org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:133) -- View this message in context: http://www.nabble.com/Problem-with-machine-hostname-and-Solr-Tomcat-tf3816176.html#a10803121 Sent from the Solr - User mailing list archive at Nabble.com.
Re: Difficulty posting unicode to solr index
Yes, it's definitely encoded in UTF-8. I'm going to attempt either today or Tuesday to post the files to a solr index that is online (as opposed to localhost as was my case a few days ago) using post.sh through SSH and let you know how it turns out. That should definitely indicate whether or not the problem is with my files themselves or the post.jar file. On 5/24/07, James liu <[EMAIL PROTECTED]> wrote: how do you know your file is encoded in UTF-8? 2007/5/24, Ethan Gruber <[EMAIL PROTECTED]>: > > Hi, > > I am attempting to post some unicode XML documents to my solr > index. They > are encoded in UTF-8. When I attempt to query from the solr admin page, > I'm > basically getting gibberish garbage text in return. I decided to try a > file > that I know is supposed to work, which is the utf8-example.xml found in > the > exampledocs folder. This also did not return proper unicode > results. None > of my other coworkers have run into this problem, but I believe there is > one > difference between their system and my system which could account for > the > error. They're using Macs and thus posting with post.sh, and I am > running > Windows and posting with a post.jar file. Could post.jar not support > unicode? Has anyone run into this problem before? > > Thanks, > Ethan > -- regards jl
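One way to answer James's question -- checking whether a file really is UTF-8 before posting it -- is to try decoding its raw bytes (a sketch; the example path refers to the tutorial's exampledocs):

```python
def is_utf8_bytes(data):
    """Return True if the given bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

def is_utf8_file(path):
    """Check a whole file, e.g. is_utf8_file("exampledocs/utf8-example.xml")."""
    with open(path, "rb") as f:
        return is_utf8_bytes(f.read())
```

Note this only proves the bytes are valid UTF-8; an editor that silently re-saves in another encoding (as Yonik warns below) would make this check fail.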
The function of distinct of RDBMS
Hi, my name is Techan. I want something like the DISTINCT function of an RDBMS in Solr, and I want to be able to use it on any field. However, I don't understand how to accomplish this in detail. Does anyone know? (I'm sorry about my computing English.)
AW: Re[2]: add and delete docs at same time
Thierry, If you always start from scratch you could even reset the index completely (i.e. delete the index directory). Solr will create a new index automatically at startup. If you don't like to delete the files another approach would be to use a query that returns all documents. You do not need a dummy field for this. The range query [* TO *] returns all documents. (In newer versions of solr you can use *:*, which executes a bit faster.) -- Christian -Ursprüngliche Nachricht- Von: Thierry Collogne [mailto:[EMAIL PROTECTED] Gesendet: Freitag, 25. Mai 2007 10:30 An: solr-user@lucene.apache.org; Jack L Betreff: Re: Re[2]: add and delete docs at same time We always do a full delete before indexing, this is because for us that is the only way to be sure that there are no documents in the index that don't exist anymore. So delete all, then add all. To use the delete all, we did the following. We added a field called dummyDelete. This field always contains the value delete. Like this delete Then to delete all documents we do a request containing: dummyDelete:delete That way all documents are deleted where the field dummyDelete contains delete => all the documents Hope this is clear. I am not sure if this is a good solution, but it does work. :) Greet, Thierry On 25/05/07, Jack L <[EMAIL PROTECTED]> wrote: > > Oh, is that the case? One document per request for delete? > I'm about to implement delete. Just want to confirm. > > -- > Best regards, > Jack > > Thursday, May 24, 2007, 12:47:21 PM, you wrote: > > > currently no. > > > Right now you even need a new request for each delete... > > > > Patrick Givisiez wrote: > >> > >> can I add and delete docs at same post? > >> > >> Some thing like this: > >> > >> myDocs.xml > >> = > >> > >> 4 >> name="mainId">5 >> name="mainId">6 1 > >> 2 3 > >> = > >> > >> Thanks! > >> > >> > >> > >> > >
Re: Re[2]: add and delete docs at same time
We always do a full delete before indexing, this is because for us that is the only way to be sure that there are no documents in the index that don't exist anymore. So delete all, then add all. To use the delete all, we did the following. We added a field called dummyDelete. This field always contains the value delete. Like this delete Then to delete all documents we do a request containing: dummyDelete:delete That way all documents are deleted where the field dummyDelete contains delete => all the documents Hope this is clear. I am not sure if this is a good solution, but it does work. :) Greet, Thierry On 25/05/07, Jack L <[EMAIL PROTECTED]> wrote: Oh, is that the case? One document per request for delete? I'm about to implement delete. Just want to confirm. -- Best regards, Jack Thursday, May 24, 2007, 12:47:21 PM, you wrote: > currently no. > Right now you even need a new request for each delete... > Patrick Givisiez wrote: >> >> can I add and delete docs at same post? >> >> Some thing like this: >> >> myDocs.xml >> = >> >> 4 >> 5 >> 6 >> >> 1 >> 2 >> 3 >> = >> >> Thanks! >> >> >> >>