Solr Query | Loading documents with large content (Performance)

2013-01-15 Thread Uwe Clement
Hi there,

sometimes we have to load very big documents: one or two of their
multi-valued fields can contain 10,000 items each. Unfortunately, we need
this information.

We have to load 50 documents in order to show them in the result table in
the UI.

The query takes around 50 seconds. I guess 48 seconds of that is spent just
transferring the content of the documents over the network.

What can I do here?

- I know I could move this long information out of the document, but that
is not really a solution.

- I was also thinking about compressed fields. They come back with Solr 4.1,
right?

How does it work with compressed fields? As I understand it, the stored
fields are stored in compressed form. OK, but when are they decompressed?

- On the server side, before being sent back to the client?

- Or on the client side? I am using SolrJ.

Any other ideas? Can compressed fields improve the query performance?

Thanks a lot for your ideas and answers!

Regards
Uwe





Re: Question about dates and SolrJ

2013-01-13 Thread Uwe Clement
In 3.6.1 I also got back a Date instance; now, with 4.0, I also receive a
String.

I don't like this, but I have adapted my software now.

Is there no way to change this behavior in the config?

-----Original Message-----
From: Shawn Heisey [mailto:s...@elyograg.org]
Sent: Sunday, January 13, 2013 07:53
To: solr-user@lucene.apache.org
Subject: Re: Question about dates and SolrJ

On 1/12/2013 7:51 PM, Jack Park wrote:
 My work engages SolrJ, with which I send documents off to Solr 4, which
 stores them properly, as viewed in the admin panel, as in this example:
 2013-02-04T02:11:39.995Z

 When I retrieve a document with that date, I use the SolrDocument
 returned as a Map<String,Object>, in which the date now looks like
 this:
 Sun Feb 03 18:11:39 PST 2013

 I am thinking that I am missing something in the SolrJ configuration,
 though it could be in how I structure the query; for now, here is the
 simplistic way I setup SolrJ:

 HttpSolrServer server = new HttpSolrServer(solrURL);
 server.setParser(new XMLResponseParser());

 Is there something I am missing to retain dates as Solr stores them?

Quick note: setting the parser is NOT necessary unless you are trying to
connect radically different versions of Solr and SolrJ (1.x and 3.x/later,
to be precise), and will in fact make SolrJ slightly slower when
contacting Solr.  Just let it use the default javabin parser -- it's
faster.

If your date field in Solr is an actual date type, then you should be
getting back a Date object in Java which you can manipulate in all the
usual Java ways.  The format that you are seeing matches the toString()
output from a Date object:

http://docs.oracle.com/javase/6/docs/api/java/util/Date.html#toString%28%29

You'll almost certainly have to cast the object so it's the right type:

Date dateField = (Date) doc.get(datefieldname);
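
For illustration, here is a minimal sketch of what happens after that cast. It uses only the JDK and does not talk to Solr; the class name DateFieldExample and the helper toSolrString are made up for this example, and the Date built in main merely stands in for the value that doc.get(datefieldname) would return. It assumes Java 8 or later, because it relies on Date.toInstant(), which did not exist in the Java 6 docs linked above.

```java
import java.time.Instant;
import java.util.Date;

public class DateFieldExample {

    /**
     * Renders a field value in the ISO-8601 UTC form shown in the Solr admin
     * panel. A java.util.Date is converted; any other value is rendered with
     * String.valueOf.
     */
    public static String toSolrString(Object fieldValue) {
        if (fieldValue instanceof Date) {
            // Date.toInstant() keeps the millisecond value and
            // Instant.toString() prints it in UTC.
            return ((Date) fieldValue).toInstant().toString();
        }
        return String.valueOf(fieldValue);
    }

    public static void main(String[] args) {
        // Stand-in for doc.get(datefieldname): the instant from the thread.
        Object fieldValue = Date.from(Instant.parse("2013-02-04T02:11:39.995Z"));

        // Date.toString() uses the JVM default time zone, so this line
        // differs from machine to machine.
        System.out.println(fieldValue);

        // Same instant, rendered in UTC.
        System.out.println(toSolrString(fieldValue));
    }
}
```

The point of the sketch is that the Date object and the string stored in Solr describe the same instant; only the rendering differs.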

Thanks,
Shawn



Re: SolrJ | Atomic Updates | How works exactly?

2013-01-13 Thread Uwe Clement
Thanks Erick,

the main reason why I want to use atomic updates is to speed up updating
existing, fairly large documents.

So if, under the covers, everything is the same (loading the whole doc,
updating it, re-indexing the whole doc), it is not interesting for me anymore.

What is the best, most performant way to update a large document?

Any recommendations?

THANKS!

-----Original Message-----
From: Erick Erickson [mailto:erickerick...@gmail.com]
Sent: Sunday, January 13, 2013 16:53
To: solr-user@lucene.apache.org
Subject: Re: SolrJ | Atomic Updates | How works exactly?

Atomic updates work by storing (stored=true) all the fields (note, you
don't have to set stored=true for the destinations of copyField).
Anyway, when you use the atomic update syntax under the covers Solr reads
all the stored fields out, re-assembles the document and re-indexes it. So
your index may be significantly larger. Also note that in the 4.1 world,
stored fields are automatically compressed so this may not be so much of a
problem.
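
The read, re-assemble, re-index cycle described above can be pictured with a small toy model. This is not Solr's code: the class AtomicUpdateSketch, the method applyAdd, and the field name "items" are all hypothetical, and the "index" is just a Java map. The only part borrowed from Solr is the shape of the modifier, a map with the key "add"; in SolrJ 4.x such a map is, as far as I know, what you pass as the value to SolrInputDocument.addField(fieldName, map) together with the document id.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class AtomicUpdateSketch {

    /**
     * Toy model of an "add" atomic update: copy every stored field of the
     * existing document, append the new value to one multi-valued field, and
     * return the re-assembled document that would then be re-indexed.
     */
    public static Map<String, List<Object>> applyAdd(
            Map<String, List<Object>> storedDoc,
            String field,
            Map<String, Object> modifier) {
        Map<String, List<Object>> rebuilt = new HashMap<>();
        // Step 1: read all stored fields back out (hence stored=true).
        for (Map.Entry<String, List<Object>> e : storedDoc.entrySet()) {
            rebuilt.put(e.getKey(), new ArrayList<Object>(e.getValue()));
        }
        // Step 2: apply the modifier to the re-assembled copy.
        Object toAdd = modifier.get("add");
        if (toAdd != null) {
            rebuilt.computeIfAbsent(field, k -> new ArrayList<Object>()).add(toAdd);
        }
        // Step 3: the caller would now re-index the whole rebuilt document.
        return rebuilt;
    }

    public static void main(String[] args) {
        Map<String, List<Object>> stored = new HashMap<>();
        stored.put("id", new ArrayList<Object>(Arrays.asList("doc-1")));
        stored.put("items", new ArrayList<Object>(Arrays.asList("a", "b")));

        // The client only sends the id plus this small modifier.
        Map<String, Object> modifier = Collections.<String, Object>singletonMap("add", "c");

        Map<String, List<Object>> rebuilt = applyAdd(stored, "items", modifier);
        System.out.println(rebuilt.get("items")); // prints [a, b, c]
    }
}
```

What the sketch tries to show is that the client sends only a small request, while the server still touches every stored field of the document.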

And there have been at least one or two fixes to this since 4.0, as I
remember, so you might want to wait for 4.1 to experiment with (there's
talk of cutting RC1 for Solr 4.1 early next week) or use a nightly build.

Best
Erick


On Sun, Jan 13, 2013 at 3:43 AM, uwe72 uwe.clem...@exxcellent.de wrote:

 i have very big documents in the index.

 i want to update a multivalue field of a document, without loading the
 whole document.

 how can i do this?

 is there somewhere a good documentation?

 regards






Re: SolrJ | Atomic Updates | How works exactly?

2013-01-13 Thread Uwe Clement
Thanks Yonik.

Is this already working well on Solr 4.0, or is it better to wait for Solr
4.1?


-----Original Message-----
From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of Yonik
Seeley
Sent: Sunday, January 13, 2013 20:24
To: solr-user@lucene.apache.org
Subject: Re: SolrJ | Atomic Updates | How works exactly?

On Sun, Jan 13, 2013 at 1:51 PM, Uwe Clement uwe.clem...@exxcellent.de
wrote:
 What is the best, most performant way to update a large document?

That *is* the best way to update a large document that we currently have.
Although it re-indexes under the covers, it ensures that it's atomic, and
it's faster because it does everything in a single request.

-Yonik
http://lucidworks.com