[jira] Commented: (SOLR-20) A simple Java client for updating and searching
[ https://issues.apache.org/jira/browse/SOLR-20?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12466970 ]

Ryan McKinley commented on SOLR-20:
---

I have dramatically reworked the client code to fit with the pluggable ContentStream model in SOLR-104. This version makes it easy to customize/extend the request/response behavior. Once it stabilizes and is better tested, I'll upload a zip, but for now, you can preview it:

  http://svn.lapnap.net/solr/solrj/

Major changes:
 * it is based on commons-httpclient-3.0.1.jar
 * I'm using wt=JSON rather than XML. (It maps to a hash easier)
 * I moved some of the common classes to o.a.s.util. Hopefully the core classes will be refactored to make it easier to share some classes
 * handles multiple ContentStreams using multi-part form upload
 * Got rid of the SolrDocumentable/SolrDocumented distinction. -- now there is only SolrDocument()
 * You can define and automatically build a solr document with annotations
 * Includes a first draft for a HibernateEventListener. When stuff is added/updated/deleted, it gets sent to solr.
(Note, this class should probably not be in the main client, as the Hibernate prerequisite libraries are substantial - I've included them because it's what I need to have working soon.) When this is more stable, it will be something similar to a Compass Hibernate3GpsDevice (http://www.opensymphony.com/compass/versions/1.1RC1/html/gps-hibernate.html)

The key interfaces are:

  public interface SolrClient {
    public abstract SolrResponse process( final SolrRequest req );
  }

  public interface SolrRequest {
    public String getMethod();
    public String getHandlerPath();
    public RequestParams getParams();
    public Collection<ContentStream> getContentStreams();
    public SolrResponse parseResponseBody(InputStream in);
    public SolrResponse execute(SolrClient solr);
  }

- - - - - - -

Here is some sample usage:

  SolrClient client = new CommonsHttpSolrClient( new URL("http://localhost:8983/solr/") );

  // Set up a simple query
  SolrQuery query = new SolrQuery();
  query.setQuery( "solr" );
  query.addFacetField( "cat" );
  query.setFacetLimit( 15 );
  query.setQuery( "video" );
  query.setShowDebugInfo( true );

  QueryResponse rsp = query.execute( client );
  for( ResultDoc doc : rsp.getDocs() ) {
    System.out.println( doc.get( "name" ) );
    System.out.println( doc.getScore() );
    System.out.println( doc.getExplain() );
  }

  SimpleSolrDoc doc = new SimpleSolrDoc();
  doc.setField( "id", "xxx" );
  doc.setField( "price", 12.34f );
  doc.setField( "cat", new String[] { "aaa", "bbb", "ccc" } );
  new AddDocuments( doc ).execute( client );
  new CommitIndex().execute( client );

- - - - - - - - - - -

This also includes a utility to make solr documents from annotations.
Given the class:

  @SolrSearchable( boost=2.0 )
  public class Example {
    @SolrSearchable
    public String getName() { return "hello"; }

    @SolrSearchable( name="cat", boost=3 )
    public String getSomeOtherName() { return "there"; }
  }

The DocumentBuilder can automatically make:

  <doc boost="2.0">
    <field name="name">hello</field>
    <field name="cat" boost="3">there</field>
  </doc>

- - - - - - - - - - -

There are a few parts of the API I think are awkward; I'd love any feedback / review you may have.

thanks
ryan

A simple Java client for updating and searching
---

Key: SOLR-20
URL: https://issues.apache.org/jira/browse/SOLR-20
Project: Solr
Issue Type: New Feature
Components: clients - java
Environment: all
Reporter: Darren Erik Vengroff
Priority: Minor
Attachments: DocumentManagerClient.java, DocumentManagerClient.java, solr-client-java-2.zip.zip, solr-client-java.zip, solr-client-sources.jar, solr-client.zip, solr-client.zip, solr-client.zip, SolrClientException.java, SolrServerException.java

I wrote a simple little client class that can connect to a Solr server and issue add, delete, commit and optimize commands using Java methods. I'm posting here for review and comments as suggested by Yonik.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
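One way the annotation-driven DocumentBuilder described in this message could work is sketched below. This is only an illustration under stated assumptions, not the solrj code: the SolrSearchable annotation is re-declared locally as a stand-in (with a float boost, so the field boost prints as 3.0 rather than 3), and AnnotationDocSketch, fieldName, boostAttr and escape are hypothetical names.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Local stand-in for the annotation in the message; NOT the real solrj class.
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.TYPE, ElementType.METHOD})
@interface SolrSearchable {
    String name() default "";    // empty: derive the field name from the getter
    float boost() default 1.0f;  // 1.0: emit no boost attribute
}

@SolrSearchable(boost = 2.0f)
class Example {
    @SolrSearchable
    public String getName() { return "hello"; }

    @SolrSearchable(name = "cat", boost = 3)
    public String getSomeOtherName() { return "there"; }
}

class AnnotationDocSketch {
    // "getName" -> "name", unless the annotation names the field explicitly.
    static String fieldName(Method m, SolrSearchable s) {
        if (!s.name().isEmpty()) return s.name();
        String n = m.getName();
        if (n.length() > 3 && n.startsWith("get")) n = n.substring(3);
        return Character.toLowerCase(n.charAt(0)) + n.substring(1);
    }

    static String boostAttr(float boost) {
        return boost == 1.0f ? "" : " boost=\"" + boost + "\"";
    }

    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    static String toXml(Object bean) {
        Class<?> type = bean.getClass();
        List<Method> getters = new ArrayList<>();
        for (Method m : type.getDeclaredMethods()) {
            if (m.isAnnotationPresent(SolrSearchable.class) && m.getParameterCount() == 0) {
                getters.add(m);
            }
        }
        // Reflection returns methods in no guaranteed order, so sort for stable output.
        getters.sort(Comparator.comparing(Method::getName));

        SolrSearchable onType = type.getAnnotation(SolrSearchable.class);
        StringBuilder xml = new StringBuilder("<doc");
        if (onType != null) xml.append(boostAttr(onType.boost()));
        xml.append('>');
        try {
            for (Method m : getters) {
                SolrSearchable s = m.getAnnotation(SolrSearchable.class);
                xml.append("<field name=\"").append(fieldName(m, s)).append('"')
                   .append(boostAttr(s.boost())).append('>')
                   .append(escape(String.valueOf(m.invoke(bean))))
                   .append("</field>");
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
        return xml.append("</doc>").toString();
    }

    public static void main(String[] args) {
        System.out.println(toXml(new Example()));
    }
}
```

Sorting the getters by name is a choice made here only to keep the output deterministic; a real builder might prefer declaration order or an explicit ordering attribute.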
Re: solrb releases
On Jan 24, 2007, at 11:35 AM, Erik Hatcher wrote: I'd rather us do it all here at ASF to keep it in-house and tightly synced with Solr itself. You can run your own gem server very easily. Perhaps a gem server for apache projects might be a good long term goal? In my opinion putting the gem on rubyforge and being able to say: gem install solr has its advantages. //Ed
Re: solrb releases
On Jan 24, 2007, at 11:54 AM, Edward Summers wrote: On Jan 24, 2007, at 11:35 AM, Erik Hatcher wrote: I'd rather us do it all here at ASF to keep it in-house and tightly synced with Solr itself. You can run your own gem server very easily. Perhaps a gem server for apache projects might be a good long term goal? The Apache distribution system is set up to do heavy-duty mirroring. How would that factor into the gem server situation? Apache itself doesn't serve most of the releases from its own hardware. In my opinion putting the gem on rubyforge and being able to say: gem install solr has its advantages. Indeed, but it isn't the end of the world for them to add a --source on there. It'll be solrb though, not solr - it seems better to keep it a separate name so it doesn't appear that you're actually installing Solr too. Erik
Re: solrb releases
On Jan 24, 2007, at 12:11 PM, Erik Hatcher wrote: Indeed, but it isn't the end of the world for them to add a --source on there. Absolutely not the end of the world--just not the ruby way :-) It'll be solrb though, not solr - it seems better to keep it a separate name so it doesn't appear that you're actually installing Solr too. So people will: require 'solrb' and we need to: svn mv lib/solr lib/solrb svn mv lib/solr.rb lib/solrb.rb ? I actually like just calling it solr myself since it's easier on the eyes and involves less work. //Ed
Re: solrb releases
On 1/24/07, Erik Hatcher [EMAIL PROTECTED] wrote: On Jan 24, 2007, at 11:53 AM, Bertrand Delacretaz wrote: On 1/24/07, Erik Hatcher [EMAIL PROTECTED] wrote: ...Yonik/Hoss, others - what do you think should be done to make releases?... If you mean an actual release (defined in [1] as any publication outside the group of people on the product dev list), the PMC must vote to approve it, and it must comply with the ASF licensing requirements (LICENSE and NOTICE files, etc). And it should be mirrored, dunno how this would work for Ruby packages? So some bureaucracy would be involved I guess. Release candidates are much more lightweight according to [1]. IIUC, putting up a gem package as you mention fits that definition, so that might be a good way of getting your stuff tested. Yes, definitely this is a release candidate. Thanks for the info. I'll review the link you sent and update the codebase with any missing pieces, and any other recommendations on this thread before releasing. I would think we could set up a nightly build to go to people.apache.org/builds/lucene/solr/solrb/nightly and perhaps release candidates could go in people.apache.org/builds/lucene/solr/solrb/ -Yonik
[jira] Commented: (SOLR-20) A simple Java client for updating and searching
[ https://issues.apache.org/jira/browse/SOLR-20?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12467116 ]

Yonik Seeley commented on SOLR-20:
--

* it is based on commons-httpclient-3.0.1.jar

Cool, +1

* I'm using wt=JSON rather than XML. (It maps to a hash easier)

Heh... I quickly checked out the code, but didn't see where you were parsing the code, or where the JSONObject class referenced is. Anyway, if you want the *best* JSON parser on the planet, check out
  http://www.nabble.com/Apache-Lab-proposal%3A-noggit-tf2701405.html#a7532843
  http://svn.apache.org/repos/asf/labs/noggit/
:-) I haven't had a chance to do the writing side, or the create full object graph part, but the parser is screaming fast.

* handles multiple ContentStreams using multi-part form upload

Will a client need to do that? I had thought a browser would be the only one using multi-part

* You can define and automatically build a solr document with annotations

Sounds cool

* Includes a first draft for a HibernateEventListener.

Sounds *very* cool... it should go in a separate contrib eventually.
Re: [jira] Commented: (SOLR-20) A simple Java client for updating and searching
* I'm using wt=JSON rather than XML. (It maps to a hash easier)

Heh... I quickly checked out the code, but didn't see where you were parsing the code, or where the JSONObject class referenced is. Anyway, if you want the *best* JSON parser on the planet, check out
  http://www.nabble.com/Apache-Lab-proposal%3A-noggit-tf2701405.html#a7532843
  http://svn.apache.org/repos/asf/labs/noggit/
:-) I haven't had a chance to do the writing side, or the create full object graph part, but the parser is screaming fast.

I'm using a slightly modified version of the json.org code. It stores things in a LinkedHashMap (to maintain order) and formats dates explicitly.
  http://svn.lapnap.net/solr/solrj/src/org/apache/solr/util/json/

I just had a quick look at noggit. It looks interesting. If I understand it correctly, noggit is to XPP as the json.org code is to DOM.

* handles multiple ContentStreams using multi-part form upload

Will a client need to do that? I had thought a browser would be the only one using multi-part

I don't know if it will *need* to do it - but I think the client API should be as close to the RequestHandler API as possible. Since (in SOLR-104) the server side accepts Iterable<ContentStream>, the client should send Iterable<ContentStream>. With commons http-client, sending multi-part data is almost equivalent to sending form data.

* You can define and automatically build a solr document with annotations

Sounds cool

* Includes a first draft for a HibernateEventListener.

Sounds *very* cool... it should go in a separate contrib eventually.

Yes, it is a single class - quite small really - but has 8MB of required .jar files to compile and test it! When it is more stable and I have some time, I'll separate it into two projects.

- - - - -

The big API question/style I'm struggling with is

  SolrResponse rsp = client.process( req );
vs
  SolrResponse rsp = req.execute( client );  // execute may not be the right word

The first one is more natural, and is how things are actually processed. The second one eliminates the need for lots of casting:

  SolrQueryResponse rsp = (SolrQueryResponse)client.process( req );
vs
  SolrQueryResponse rsp = queryRequest.execute( client );

Any thoughts?
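As a possible answer to the casting question, the sketch below shows how a type parameter on the request could give a typed response from either call style. It is a minimal illustration, not the solrj API: the interfaces are simplified re-declarations, fetch is a hypothetical transport method that returns the body as a String, and default methods require Java 8 or later.

```java
interface SolrResponse {}

final class QueryResponse implements SolrResponse {
    final int numFound;
    QueryResponse(int numFound) { this.numFound = numFound; }
}

// The request carries its response type R, so callers never need to cast.
interface SolrRequest<R extends SolrResponse> {
    String getHandlerPath();
    R parseResponseBody(String body);

    // Style 2: req.execute( client )
    default R execute(SolrClient client) { return client.process(this); }
}

interface SolrClient {
    // Hypothetical transport: returns the raw response body for a handler path.
    String fetch(String handlerPath);

    // Style 1: client.process( req ), typed through the generic method.
    default <R extends SolrResponse> R process(SolrRequest<R> req) {
        return req.parseResponseBody(fetch(req.getHandlerPath()));
    }
}

final class QueryRequest implements SolrRequest<QueryResponse> {
    public String getHandlerPath() { return "/select"; }

    // A real parser would read JSON or XML; here the body is just a number.
    public QueryResponse parseResponseBody(String body) {
        return new QueryResponse(Integer.parseInt(body.trim()));
    }
}

class TypedRequestSketch {
    public static void main(String[] args) {
        SolrClient canned = path -> "42";  // stand-in for an HTTP client
        QueryResponse a = canned.process(new QueryRequest());
        QueryResponse b = new QueryRequest().execute(canned);
        System.out.println(a.numFound + " " + b.numFound);  // prints "42 42"
    }
}
```

With this shape both call styles stay available and neither needs a cast, at the cost of a type parameter on every request class.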
Re: [jira] Commented: (SOLR-20) A simple Java client for updating and searching
On 1/24/07, Ken Krugler [EMAIL PROTECTED] wrote: Hi Ryan, The big API question/style i'm struggling with is SolrResponse rsp = client.process( req ); vs SolrResponse rsp = req.execute( client ); // execute may not be the right word The first one is more natural, and is how things are actually processed. The second one eliminates the need for lots of casting: SolrQueryResponse rsp = (SolrQueryResponse)client.process( req ); vs SolrQueryResponse rsp = queryRequest.execute( client ); Any thoughts? Haven't dug into the client code, but my natural inclination would be to go with the latter...it fits better with how I'd write the test code if I was implementing such a thing. E.g. create a mock client and then use that to test the request object. I think I agree with the latter, although I probably would have coded the former just because it would have occurred to me first. The latter allows an easy way to create new request classes w/o having to couple tightly to the client. A type of request could make two calls to the server, and join the responses or process them in different ways. I like it! -Yonik
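The two points in this message (testing a request against a mock client, and a request that makes two calls and joins the results) can be sketched together. Everything below is hypothetical and heavily simplified: SolrClient is reduced to a single get method returning a String body, and MockSolrClient and PingThenCountRequest are illustrative names only.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified transport; a real client would carry params and streams.
interface SolrClient {
    String get(String handlerPath);
}

// Test double: returns canned bodies and records which handlers were called.
class MockSolrClient implements SolrClient {
    final List<String> calls = new ArrayList<>();

    public String get(String handlerPath) {
        calls.add(handlerPath);
        return handlerPath.equals("/admin/ping") ? "OK" : "2";
    }
}

// A request that makes two round trips and combines the results.
class PingThenCountRequest {
    // Returns the hit count, or -1 if the server did not answer the ping with "OK".
    int execute(SolrClient client) {
        if (!"OK".equals(client.get("/admin/ping"))) return -1;
        return Integer.parseInt(client.get("/select"));
    }
}

class MockClientSketch {
    public static void main(String[] args) {
        MockSolrClient mock = new MockSolrClient();
        int hits = new PingThenCountRequest().execute(mock);
        System.out.println(hits + " " + mock.calls);  // prints "2 [/admin/ping, /select]"
    }
}
```

Because the request owns the call sequence, the test needs nothing from the client beyond the one transport method.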
Re: solrb releases
On Jan 24, 2007, at 4:49 PM, Erik Hatcher wrote: What I'd like to do is get Apache moved towards being Ruby savvy. The ideal, to me, is getting deployments of releases to work cleanly with gem install from the Apache infrastructure. No offense to rubyforge at all (hi Rich and Tom!), I'm aiming idealistically at making apache.org a 1st class Ruby software player. That sounds great, but I would be inclined to go with the flow of the ruby community if you want your gem to fit into it. Just my opinion though--I've expressed it and now I won't say it again. It's a good question. My take was to call the .gem solrb, but the library itself is solr, which is how it currently is set up. I'm not quite sure though. I'm agile, and can be convinced to change it. Well having a gem named solrb and a library named solr is confusing IMHO. I've done this myself in the past and regretted the inconsistency. But hey, I don't really care enough to try to convince you any more than I have tried already. //Ed
[jira] Commented: (SOLR-69) PATCH:MoreLikeThis support
[ https://issues.apache.org/jira/browse/SOLR-69?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12467179 ]

Bertrand Delacretaz commented on SOLR-69:
-

Intuitively, without having checked exactly how it's implemented, I think MoreLikeThis queries should work irrespective of whether fields are stored or not, as it's based on what's indexed. Maybe someone who knows Lucene's internals better than I do can comment. Did you find a case where non-stored fields cause problems?

PATCH:MoreLikeThis support
--

Key: SOLR-69
URL: https://issues.apache.org/jira/browse/SOLR-69
Project: Solr
Issue Type: Improvement
Components: search
Reporter: Bertrand Delacretaz
Priority: Minor
Attachments: lucene-queries-2.0.0.jar, SOLR-69.patch, SOLR-69.patch, SOLR-69.patch

Here's a patch that implements simple support of Lucene's MoreLikeThis class. The MoreLikeThisHelper code is heavily based on (hmm..."lifted from" might be more appropriate ;-) Erik Hatcher's example mentioned in http://www.mail-archive.com/solr-user@lucene.apache.org/msg00878.html

To use it, add at least the following parameters to a standard or dismax query:

  mlt=true
  mlt.fl=list,of,fields,which,define,similarity

See the MoreLikeThisHelper source code for more parameters.

Here are two URLs that work with the example config, after loading all documents found in exampledocs in the index (just to show that it seems to work - of course you need a larger corpus to make it interesting):

  http://localhost:8983/solr/select/?stylesheet=&q=apache&qt=standard&mlt=true&mlt.fl=manu,cat&mlt.mindf=1&mlt.mindf=1&fl=id,score
  http://localhost:8983/solr/select/?stylesheet=&q=apache&qt=dismax&mlt=true&mlt.fl=manu,cat&mlt.mindf=1&mlt.mindf=1&fl=id,score

Results are added to the output like this:

  <response>
  ...
  <lst name="moreLikeThis">
    <result name="UTF8TEST" numFound="1" start="0" maxScore="1.5293242">
      <doc>
        <float name="score">1.5293242</float>
        <str name="id">SOLR1000</str>
      </doc>
    </result>
    <result name="SOLR1000" numFound="1" start="0" maxScore="1.5293242">
      <doc>
        <float name="score">1.5293242</float>
        <str name="id">UTF8TEST</str>
      </doc>
    </result>
  </lst>

I haven't tested this extensively yet, will do in the next few days. But comments are welcome of course.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
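For anyone calling this from Java, a query URL with the mlt parameters could be assembled along these lines. This is a small sketch under stated assumptions: MltUrlSketch, enc and buildUrl are hypothetical helpers, and the host, handler path and parameter values are simply the ones from the example URLs in the issue description.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

class MltUrlSketch {
    // URL-encode one key or value; UTF-8 is always available, so the
    // checked exception is rethrown unchecked.
    static String enc(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    // Append the parameters to the base URL, in map iteration order.
    static String buildUrl(String base, Map<String, String> params) {
        StringBuilder url = new StringBuilder(base);
        char sep = '?';
        for (Map.Entry<String, String> p : params.entrySet()) {
            url.append(sep).append(enc(p.getKey())).append('=').append(enc(p.getValue()));
            sep = '&';
        }
        return url.toString();
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();  // keeps insertion order
        params.put("q", "apache");
        params.put("qt", "standard");
        params.put("mlt", "true");
        params.put("mlt.fl", "manu,cat");
        params.put("mlt.mindf", "1");
        params.put("fl", "id,score");
        System.out.println(buildUrl("http://localhost:8983/solr/select/", params));
    }
}
```

URLEncoder writes the commas as %2C, which decodes to the same parameter value on the server side.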
Re: [jira] Commented: (SOLR-20) A simple Java client for updating and searching
This might seem outlandish but have you considered modeling a server instead of a client? Then you can send request messages to it and get back response messages.

  SolrSelectResponse response = server.select(selectOptions);

I like the model, but I want to be able to easily write a client for a custom SolrRequestHandler. The server model can't be tied directly to update(), select(), ping() etc. Right now I'm working with:

  public interface SolrServer {
    SolrResponse request(
      final String path,
      final METHOD method,
      final RequestParams params,
      final Collection<ContentStream> streams,
      final ResponseStreamProcessor processor );
  }

  public interface ResponseStreamProcessor {
    SolrResponse processResponseStream( InputStream body );
  }

  public interface SolrRequest {
    SolrResponse process( SolrServer server );
  }

- - - - - - - - - -

This would be a sample 'ping' request:

  public class SolrPing implements SolrRequest {
    public SolrResponse process(SolrServer server) {
      return (SolrPingResponse)server.request(
        "/admin/ping", METHOD.GET, null, null,
        new ResponseStreamProcessor() {
          public SolrResponse processResponseStream(InputStream body) {
            // something real would actually process the body
            return new SolrPingResponse();
          }
        });
    }
  }

And you can call it with:

  new SolrPing().process( server );
Re: facet response
On Jan 22, 2007, at 6:14 PM, Yonik Seeley wrote: Chris Hostetter [EMAIL PROTECTED] wrote: as i said, i'd rather invert the use case set to find where ordering isn't important and change those to Maps That might be a *lot* of changes... What's currently broken, just faceting or anything else? Faceting is the only thing I've come upon. After playing with this more and contemplating all the messages on this thread, I can't say that it's broken, but telling solr to sort things and then pulling them back out on the other end in seemingly random order sure feels that way. Re-sorting on the client is the easiest solution and I've gone that route for now. I plan on digging into the JSON option a bit and seeing if order is preserved, though I doubt it would make any difference since it will surely parse back into a Hash by default. Setting json.nl.arr=arr would surely preserve order, though that changes the access to things all over the place on the client. Having the facet_counts area output as an ordered list in all cases seems the most sensible to me, since it is unlikely that the facets would be accessed by key. But, again, re-sorting on the client is sufficient for me for now. Erik
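The client-side re-sort mentioned in this message is small enough to sketch. The thread concerns the Ruby client; the version below is Java to stay consistent with the other examples here, and FacetSortSketch, sortByCountDesc and the sample counts are made up for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class FacetSortSketch {
    // Copy facet counts into an insertion-ordered map, highest count first,
    // breaking ties alphabetically by facet value.
    static LinkedHashMap<String, Integer> sortByCountDesc(Map<String, Integer> counts) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        LinkedHashMap<String, Integer> ordered = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : entries) {
            ordered.put(e.getKey(), e.getValue());
        }
        return ordered;
    }

    public static void main(String[] args) {
        // A plain HashMap stands in for a parsed response that lost its ordering.
        Map<String, Integer> parsed = new HashMap<>();
        parsed.put("memory", 3);
        parsed.put("electronics", 14);
        parsed.put("search", 3);
        System.out.println(sortByCountDesc(parsed));  // prints "{electronics=14, memory=3, search=3}"
    }
}
```

Returning a LinkedHashMap keeps lookup by key available while preserving the sorted order for display.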
Re: facet.missing?
On Jan 23, 2007, at 4:52 PM, J.J. Larrea wrote: In fact, it would probably please too-lazy-to-translate-facet-value-strings front-end coders if there were a way to provide a label for the missing count, e.g. <str name="facet.missinglabel">None</str> <str name="f.author.facet.missinglabel">No authorship indicated</str> There is much more to facet name mapping for display in the grand scheme of things. I18N is something I'm interested in building into Flare such that facets get returned and looked up for the proper label in the user's locale. I think having a missing label feature is unnecessary at this point. We're already on parameter overload as it is, so pushing back on new ones seems prudent. Or better yet facet.missing itself could be a string rather than a boolean, with the missing count suppressed if it is undefined, null, 'false', or empty, and 'true' enabling it with a null label for reverse compatibility. It's too risky that the string would correspond with a value in the facet field too, which would cause confusion or even an error if the client was expecting unique values. Erik
[jira] Updated: (SOLR-84) New Solr logo?
[ https://issues.apache.org/jira/browse/SOLR-84?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Clay Webster updated SOLR-84: - Attachment: logo-solr-source-files-take2.zip I took the contents of the zip file and made a logo without the crescent, but with an 'o' that has the diffused rays where the crescent used to be. In the new zip are a full size version, an original logo sized version, and a new cropped version. New Solr logo? -- Key: SOLR-84 URL: https://issues.apache.org/jira/browse/SOLR-84 Project: Solr Issue Type: Improvement Reporter: Bertrand Delacretaz Priority: Minor Attachments: logo-solr-source-files-take2.zip, solr-84-source-files.zip, solr-logo-20061214.jpg, solr-logo-20061218.JPG Following up on SOLR-76, our trainee Nicolas Barbay (nicolas (put at here) sarraux-dessous.ch) has reworked his logo proposal to be more solar. This can either be the start of a logo contest, or if people like it we could adopt it. The gradients can make it a bit hard to integrate, not sure if this is really a problem. WDYT? -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (SOLR-84) New Solr logo?
[ https://issues.apache.org/jira/browse/SOLR-84?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Clay Webster updated SOLR-84: - Attachment: solr-logo-20070124.JPG New Solr logo? -- Key: SOLR-84 URL: https://issues.apache.org/jira/browse/SOLR-84 Project: Solr Issue Type: Improvement Reporter: Bertrand Delacretaz Priority: Minor Attachments: logo-solr-source-files-take2.zip, solr-84-source-files.zip, solr-logo-20061214.jpg, solr-logo-20061218.JPG, solr-logo-20070124.JPG Following up on SOLR-76, our trainee Nicolas Barbay (nicolas (put at here) sarraux-dessous.ch) has reworked his logo proposal to be more solar. This can either be the start of a logo contest, or if people like it we could adopt it. The gradients can make it a bit hard to integrate, not sure if this is really a problem. WDYT? -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
Re: [jira] Commented: (SOLR-20) A simple Java client for updating and searching
On Jan 24, 2007, at 6:53 PM, Ryan McKinley wrote: new SolrPing().process( server ); doesn't server.ping(); look cleaner? I can't argue with you there! I may be taking Hoss's point #4 on https://issues.apache.org/jira/browse/SOLR-20#action_12464641 too seriously. My revised version is designed totally around Requests processing a handler's response. The server does not even know what handlers it may be running - it is just the transport layer between the request and the response. But that may be too big of an abstraction for the simple use case. server.ping() server.add( documents ); server.delete( id ); server.select( options ); ... are obviously cleaner. ryan
Re: [jira] Commented: (SOLR-20) A simple Java client for updating and searching
On 1/25/07, Ryan McKinley [EMAIL PROTECTED] wrote: On Jan 24, 2007, at 6:53 PM, Ryan McKinley wrote: new SolrPing().process( server ); doesn't server.ping(); look cleaner? I can't argue with you there! I may be taking Hoss's point #4 on https://issues.apache.org/jira/browse/SOLR-20#action_12464641 too seriously. My revised version is designed totally around Requests processing a handler's response. The server does not even know what handlers it may be running - it is just the transport layer between the request and the response. But that may be too big of an abstraction for the simple use case. server.ping() server.add( documents ); server.delete( id ); server.select( options ); ... are obviously cleaner. One form doesn't preclude the other either; you can have both. For more complex or custom request handlers, keeping the logic (and library dependencies) out of the server seems like a good idea. -Yonik
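Having both forms could look roughly like the sketch below: a thin convenience class whose ping() delegates to a request object, with a generic escape hatch for custom handlers. The types are simplified re-declarations rather than the interfaces from the thread (the transport here takes only a path and a method and returns a String body), and SimpleSolr and run are hypothetical names.

```java
// Hypothetical, simplified transport: the interface discussed in the thread
// also takes params, content streams, and a response processor.
interface SolrServer {
    String request(String path, String method);
}

// A request knows how to talk to a handler and interpret its response.
interface SolrRequest<R> {
    R process(SolrServer server);
}

class SolrPing implements SolrRequest<Boolean> {
    public Boolean process(SolrServer server) {
        return "OK".equals(server.request("/admin/ping", "GET"));
    }
}

// Convenience layer: simple calls for the common cases, built on request objects.
class SimpleSolr {
    private final SolrServer server;

    SimpleSolr(SolrServer server) { this.server = server; }

    // The "clean" form: solr.ping()
    boolean ping() { return new SolrPing().process(server); }

    // The extensible form: any custom request, without changing this class.
    <R> R run(SolrRequest<R> request) { return request.process(server); }
}

class FacadeSketch {
    public static void main(String[] args) {
        SolrServer fake = (path, method) -> "OK";  // stand-in transport
        SimpleSolr solr = new SimpleSolr(fake);
        System.out.println(solr.ping());  // prints "true"
        String body = solr.run(s -> s.request("/custom", "GET"));
        System.out.println(body);  // prints "OK"
    }
}
```

The convenience methods stay one-liners, so handler-specific logic and its library dependencies remain in the request classes, as suggested in the message.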
Re: [jira] Commented: (SOLR-104) Update Plugins
: ...It is harder to test things that rely on a container, but there are : many techniques to make it easier with mocks... : : FWIW, it's very easy to start Jetty in embedded mode from a Java : class, this can be useful for testing. I'm in favor of both of these things: it would be great if the TestHarness class had some helper methods for using MockHttpServletRequest/Response objects to test the servlets directly in UnitTests, and it would be great if we had an Integration Test phase of our build system that spun up jetty using the example schema, populated it with data, and then tested it over the wire. -Hoss
Re: [jira] Commented: (SOLR-104) Update Plugins
: I'd like to hear from hoss though, since he was following along more : than I was, esp at the start of that marathon thread. FYI: I'm starting to look at this now. I send this mail only because Ryan is so damn fast and I'd like to encourage him to take a break and watch some TV or something so that I can be reasonably sure there won't be a new version of the patch for me to look at before I finish looking at the current one :) -Hoss