Solr Query | Loading documents with large content (Performance)
Hi there,

sometimes we have to load very large documents: one or two of their multi-valued fields can contain 10,000 items, and unfortunately we need this information. We have to load 50 documents to fill the result table in the UI. The query takes around 50 seconds, and I guess 48 of those seconds are spent just transferring the document content over the network.

What can I do here?

- I know I could move this long information out of the document, but that is not really a solution.
- Then I was thinking about compressed fields. They come back with Solr 4.1, right?

How do compressed fields work? As I understand it, the stored field is kept in compressed form. OK, but when is it uncompressed?

- On the server side, before it is sent back to the client?
- Or on the client side? I am using SolrJ.

Any other ideas? Can compressed fields improve query performance?

Thanks a lot for your ideas and answers!

Regards,
Uwe

--
Uwe Clement
Software Architect, Project Manager
eXXcellent solutions gmbh
http://www.exxcellent.de
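To get a feel for how much of the 48 seconds could be payload size, the transfer volume can be estimated offline. The following is a minimal sketch, assuming a made-up value shape ("part-number-N") and the figures from the message (50 documents, 2 large fields, 10,000 items each). PayloadSizeSketch and its methods are hypothetical names, not part of Solr or SolrJ, and the sketch says nothing about where Solr decompresses stored fields; it only compares raw and gzip-compressed sizes of synthetic data.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper for a rough size estimate; not part of Solr or SolrJ.
final class PayloadSizeSketch {

    /** Builds a synthetic multi-valued field: one made-up value per line. */
    static byte[] syntheticField(int items) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < items; i++) {
            sb.append("part-number-").append(i).append('\n'); // assumed value shape
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    /** Returns the size in bytes of the gzip-compressed form of the input. */
    static int gzipSize(byte[] raw) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.size();
    }

    public static void main(String[] args) {
        byte[] oneField = syntheticField(10_000);
        // 50 documents per result page, 2 large fields per document (figures from the message).
        long rawPerPage = 50L * 2 * oneField.length;
        long gzipPerPage = 50L * 2 * gzipSize(oneField);
        System.out.println("raw bytes per result page:  " + rawPerPage);
        System.out.println("gzip bytes per result page: " + gzipPerPage);
    }
}
```

Real field values will compress differently from this synthetic data, so the numbers are only useful for judging the order of magnitude before changing any server or client configuration.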
Re: Question about dates and SolrJ
In 3.6.1 I also got back a Date instance, but since 4.0 I receive a String. I don't like this, but I have adapted my software. Is there no way to change this behavior in the config?

-----Original Message-----
From: Shawn Heisey [mailto:s...@elyograg.org]
Sent: Sunday, 13 January 2013 07:53
To: solr-user@lucene.apache.org
Subject: Re: Question about dates and SolrJ

On 1/12/2013 7:51 PM, Jack Park wrote:
> My work engages SolrJ, with which I send documents off to Solr 4. They are stored properly, as viewed in the admin panel, like this example:
>
>     2013-02-04T02:11:39.995Z
>
> When I retrieve a document with that date, I use the SolrDocument returned as a Map<String,Object>, in which the date now looks like this:
>
>     Sun Feb 03 18:11:39 PST 2013
>
> I am thinking that I am missing something in the SolrJ configuration, though it could be in how I structure the query; for now, here is the simplistic way I set up SolrJ:
>
>     HttpSolrServer server = new HttpSolrServer(solrURL);
>     server.setParser(new XMLResponseParser());
>
> Is there something I am missing to retain dates as Solr stores them?

Quick note: setting the parser is NOT necessary unless you are trying to connect radically different versions of Solr and SolrJ (1.x and 3.x/later, to be precise), and will in fact make SolrJ slightly slower when contacting Solr. Just let it use the default javabin parser -- it's faster.

If your date field in Solr is an actual date type, then you should be getting back a Date object in Java which you can manipulate in all the usual Java ways. The format that you are seeing matches the toString() output from a Date object:

http://docs.oracle.com/javase/6/docs/api/java/util/Date.html#toString%28%29

You'll almost certainly have to cast the object so it's the right type:

    Date dateField = (Date) doc.get(datefieldname);

Thanks,
Shawn
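Since this thread reports both a Date and a String coming back for date fields, client code can be written to tolerate either. The following is a minimal sketch, assuming Java 8 or later (it uses java.time, which did not exist when this thread was written) and assuming that a String value is in the ISO-8601 UTC form shown above. SolrDateSketch, toDate and toSolrString are hypothetical names, not part of SolrJ.

```java
import java.time.Instant;
import java.util.Date;

// Hypothetical helper; not part of SolrJ.
final class SolrDateSketch {

    /** Accepts a java.util.Date or an ISO-8601 UTC string and returns a Date. */
    static Date toDate(Object fieldValue) {
        if (fieldValue instanceof Date) {
            return (Date) fieldValue; // already the right type, just cast
        }
        if (fieldValue instanceof String) {
            // Assumes a value such as 2013-02-04T02:11:39.995Z
            return Date.from(Instant.parse((String) fieldValue));
        }
        throw new IllegalArgumentException("Unsupported date value: " + fieldValue);
    }

    /** Renders a Date as an ISO-8601 instant in UTC instead of the local-time toString() form. */
    static String toSolrString(Date date) {
        return date.toInstant().toString();
    }

    public static void main(String[] args) {
        Date parsed = toDate("2013-02-04T02:11:39.995Z");
        System.out.println(parsed);               // Date.toString(): rendered in the JVM's default time zone
        System.out.println(toSolrString(parsed)); // the same instant rendered in UTC
    }
}
```

Both lines printed by main describe the same instant; only the rendering differs, which is why the stored value and the retrieved value in the quoted message look different.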
Re: SolrJ | Atomic Updates | How works exactly?
Thanks Erick,

the main reason I want to use atomic updates is to speed up updates of existing, fairly large documents. So if everything is the same under the covers (loading the whole doc, updating it, re-indexing the whole doc), it is no longer interesting for me.

What is the best, most performant way to update a large document? Any recommendations?

THANKS!

-----Original Message-----
From: Erick Erickson [mailto:erickerick...@gmail.com]
Sent: Sunday, 13 January 2013 16:53
To: solr-user@lucene.apache.org
Subject: Re: SolrJ | Atomic Updates | How works exactly?

Atomic updates work by storing (stored=true) all the fields (note, you don't have to set stored=true for the destinations of copyField). Anyway, when you use the atomic update syntax, under the covers Solr reads all the stored fields out, re-assembles the document and re-indexes it. So your index may be significantly larger. Also note that in the 4.1 world, stored fields are automatically compressed, so this may not be so much of a problem.

And there have been at least 1 or 2 fixes to this since 4.0 as I remember, so you might want to wait for 4.1 to experiment with (there's talk of cutting RC1 for Solr 4.1 early next week) or use a nightly build.

Best
Erick

On Sun, Jan 13, 2013 at 3:43 AM, uwe72 <uwe.clem...@exxcellent.de> wrote:
> I have very big documents in the index. I want to update a multi-valued field of a document without loading the whole document. How can I do this? Is there good documentation somewhere?
>
> regards
>
> --
> View this message in context: http://lucene.472066.n3.nabble.com/SolrJ-Atomic-Updates-How-works-exactly-tp4032976.html
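The read, re-assemble, re-index cycle described in the quoted reply can be pictured with a small in-memory model. This is a conceptual sketch only: AtomicUpdateModel and its methods are hypothetical and have nothing to do with Solr's actual implementation. It shows why every field has to be stored (the whole document is rebuilt from stored values) and why the cost still scales with document size.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Conceptual model only; hypothetical class, not Solr internals.
final class AtomicUpdateModel {

    // Stored documents: id -> (field name -> values).
    private final Map<String, Map<String, List<String>>> stored = new HashMap<>();

    /** Indexes a whole document, replacing any previous version with the same id. */
    void index(String id, Map<String, List<String>> doc) {
        Map<String, List<String>> copy = new HashMap<>();
        for (Map.Entry<String, List<String>> e : doc.entrySet()) {
            copy.put(e.getKey(), new ArrayList<>(e.getValue()));
        }
        stored.put(id, copy);
    }

    /** Applies "set" or "add" to one field by reading, merging and re-indexing the whole document. */
    void atomicUpdate(String id, String field, String op, List<String> values) {
        Map<String, List<String>> current = stored.get(id);
        if (current == null) {
            throw new IllegalStateException("No stored document with id " + id);
        }
        // 1. Read all stored fields back out.
        Map<String, List<String>> rebuilt = new HashMap<>();
        for (Map.Entry<String, List<String>> e : current.entrySet()) {
            rebuilt.put(e.getKey(), new ArrayList<>(e.getValue()));
        }
        // 2. Apply the requested change to the one field.
        if ("set".equals(op)) {
            rebuilt.put(field, new ArrayList<>(values));
        } else if ("add".equals(op)) {
            rebuilt.computeIfAbsent(field, k -> new ArrayList<>()).addAll(values);
        } else {
            throw new IllegalArgumentException("Unsupported operation: " + op);
        }
        // 3. Re-index the complete, re-assembled document.
        index(id, rebuilt);
    }

    /** Returns a copy of the stored values of a field, or an empty list if absent. */
    List<String> values(String id, String field) {
        Map<String, List<String>> doc = stored.get(id);
        if (doc == null || !doc.containsKey(field)) {
            return new ArrayList<>();
        }
        return new ArrayList<>(doc.get(field));
    }

    public static void main(String[] args) {
        AtomicUpdateModel model = new AtomicUpdateModel();
        Map<String, List<String>> doc = new HashMap<>();
        doc.put("title", Arrays.asList("big document"));
        doc.put("tags", Arrays.asList("a", "b"));
        model.index("1", doc);
        model.atomicUpdate("1", "tags", "add", Arrays.asList("c"));
        System.out.println(model.values("1", "tags"));
    }
}
```

In this model the client sends only the changed field, but the server-side work in atomicUpdate still touches every field of the document, which matches the behaviour described in the reply.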
Re: SolrJ | Atomic Updates | How works exactly?
Thanks Yonik. Is this already working well in Solr 4.0, or is it better to wait for Solr 4.1?

-----Original Message-----
From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of Yonik Seeley
Sent: Sunday, 13 January 2013 20:24
To: solr-user@lucene.apache.org
Subject: Re: SolrJ | Atomic Updates | How works exactly?

On Sun, Jan 13, 2013 at 1:51 PM, Uwe Clement <uwe.clem...@exxcellent.de> wrote:
> What is the best, most performant way to update a large document?

That *is* the best way to update a large document that we currently have. Although it re-indexes under the covers, it ensures that it's atomic, and it's faster because it does everything in a single request.

-Yonik
http://lucidworks.com
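The "single request" mentioned above carries only the document id and the field operation, not the whole document. The following is a minimal sketch that builds such a request body in the JSON atomic update syntax of Solr 4 (an "add" operation on a multi-valued field). AtomicUpdateJson, quote and addValues are hypothetical names, and the field names "id" and "tags" are assumptions for illustration. In SolrJ the same operation is usually expressed by putting a Map such as {"add": value} into a SolrInputDocument as the field value.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper that only builds the request body; it does not send anything.
final class AtomicUpdateJson {

    /** Wraps a value in double quotes and escapes characters that JSON strings cannot hold raw. */
    static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            if (c == '"' || c == '\\') {
                sb.append('\\').append(c);
            } else if (c < 0x20) {
                sb.append(String.format("\\u%04x", (int) c)); // control characters as unicode escapes
            } else {
                sb.append(c);
            }
        }
        return sb.append('"').toString();
    }

    /** Builds a one-document update body that adds values to a multi-valued field. */
    static String addValues(String idField, String id, String field, List<String> values) {
        StringBuilder sb = new StringBuilder("[{");
        sb.append(quote(idField)).append(':').append(quote(id)).append(',');
        sb.append(quote(field)).append(":{\"add\":[");
        for (int i = 0; i < values.size(); i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(quote(values.get(i)));
        }
        return sb.append("]}}]").toString();
    }

    public static void main(String[] args) {
        // Field names "id" and "tags" are assumptions for illustration.
        System.out.println(addValues("id", "doc-1", "tags", Arrays.asList("x", "y")));
        // prints [{"id":"doc-1","tags":{"add":["x","y"]}}]
    }
}
```

The body stays small no matter how large the stored document is, so the network cost of the update is low even though the server still re-indexes the full document.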