You could try Avro instead of JSON/XML/Java Serialization.  It's compact (and 
new).

http://hadoop.apache.org/avro/

 Otis
--
Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch



----- Original Message ----
> From: Paul Taylor <paul_t...@fastmail.fm>
> To: java-user@lucene.apache.org
> Sent: Tue, January 5, 2010 7:44:21 AM
> Subject: Performance Results on changing the way fields are stored
> 
> So currently in my index I index and store a number of small fields, I need 
> both 
> so I can search on the fields, then I use the stored versions to generate the 
> output document (which is either an XML or JSON representation), because I 
> read 
> stored and index fields are dealt with completely seperately I tried another 
> tact only storing one field which was a serialized version of the output 
> documentation. This solves a couple of issues I was having but I was 
> disappointed that both the size of the index increased and the index build  
> time 
> increased, I thought that if all the stored data was held in one field that 
> the 
> resultant index would be smaller, and I didn't expect index time to increase 
> by 
> as much as it did. I was also suprised that Java serilaization was slower and 
> used more space than both JSON and XML serialization.
> 
> Results as Follows
> 
> Type:                                                             Time : 
> Index 
> Size
> Only indexed  no norms                                                        
>   
>           105   : 38 MB
> Only indexed                                                                  
>   
>                  111   : 43 MB
> Same fields written as Indexed and Stored  (current Situation)           115  
>  : 
> 83 MB
> Fields Indexed, One JAXB classed Stored using JSON Marshalling 140   : 115 MB
> Fields Indexed, One JAXB classed Stored using XML Marshalling  189   : 198 MB
> Fields Indexed, One JAXB classed Stored using Java Serialization   305   : 
> 485 
> MB
> 
> Are these results to be expected, could anybody suggest anything else I could 
> do
> 
> 
> Paul
> 
> 
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org

Reply via email to