On Jan 5, 2010, at 7:44 AM, Paul Taylor wrote:

> So currently in my index I index and store a number of small fields. I need
> both so I can search on the fields, and then I use the stored versions to
> generate the output document (which is either an XML or JSON representation).
> Because I read that stored and indexed fields are dealt with completely
> separately, I tried another tack: storing only one field, which was a
> serialized version of the output document. This solves a couple of issues I
> was having, but I was disappointed that both the size of the index and the
> index build time increased. I thought that if all the stored data was held in
> one field, the resultant index would be smaller, and I didn't expect index
> time to increase by as much as it did. I was also surprised that Java
> serialization was slower and used more space than both JSON and XML
> serialization.
>
> Results as follows:
>
> Type                                                            Time  Index Size
> Only indexed, no norms                                          105   38 MB
> Only indexed                                                    111   43 MB
> Same fields written as indexed and stored (current situation)   115   83 MB
> Fields indexed, one JAXB class stored using JSON marshalling    140   115 MB
> Fields indexed, one JAXB class stored using XML marshalling     189   198 MB
> Fields indexed, one JAXB class stored using Java serialization  305   485 MB

How much more verbose are these than the "raw" content? Even as terse as JSON is, it is still verbose compared to a binary format, and XML marshalling and Java serialization will be even more so. Given that you are likely displaying only 10 or so results at a time, I'd think it would be much more efficient to store only the minimal amount needed to recreate the docs in the current result set. I've also seen people have success simply storing a key in Lucene that is then used for lookup in something like Memcachedb, Tokyo Cabinet, or one of the many other key-value stores.

-Grant

--------------------------
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem using Solr/Lucene: http://www.lucidimagination.com/search
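
For illustration, the key-lookup approach mentioned above might look roughly like the sketch below. It is not Lucene code: a plain HashMap stands in for the external key-value store, a list of keys stands in for the stored id field of the ranked hits, and the names KeyLookupSketch and renderPage are hypothetical. The point is only that the full serialized document is fetched for the 10 or so hits on the current page, not stored in the index for every document.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch only: a HashMap stands in for an external key-value store, and a
// List<String> of keys stands in for the stored id field of ranked search hits.
class KeyLookupSketch {

    // Hypothetical helper: look up the serialized documents for one page of hits.
    static List<String> renderPage(List<String> rankedKeys, Map<String, String> store,
                                   int offset, int pageSize) {
        List<String> page = new ArrayList<>();
        int end = Math.min(rankedKeys.size(), offset + pageSize);
        for (int i = offset; i < end; i++) {
            String doc = store.get(rankedKeys.get(i));
            if (doc != null) { // skip keys that are missing from the store
                page.add(doc);
            }
        }
        return page;
    }

    public static void main(String[] args) {
        Map<String, String> store = new HashMap<>();
        List<String> rankedKeys = new ArrayList<>();
        // Assumed data: 25 documents whose output form is already serialized as JSON.
        for (int i = 0; i < 25; i++) {
            String key = "doc-" + i;
            store.put(key, "{\"id\":\"" + key + "\"}");
            rankedKeys.add(key);
        }
        // Only the first page of results triggers lookups in the store.
        List<String> firstPage = renderPage(rankedKeys, store, 0, 10);
        System.out.println(firstPage.size() + " documents fetched");
    }
}
```

With this split, the index carries only the searchable fields plus one short key per document, so the choice of serialization format affects the external store rather than the index size.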