Hi,

You could consider storing the date field as a String in "YYYYMMDD" format. This should save space and is likely to perform better, since every document from the same day then shares a single term.
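
As a rough sketch of that idea, the conversion could be done at index time along the lines below. The function names `to_day_key` and `day_keys_for_month` are made up for illustration and are not part of Solr or Lucene; the input format is assumed to be the UTC timestamp shown in the quoted message.

```python
import calendar
from datetime import datetime


def to_day_key(timestamp):
    """Turn a timestamp such as 2012-02-22T21:11:14Z into a YYYYMMDD string.

    Assumes the value is always UTC with a trailing 'Z', as in the
    quoted example; other formats would need their own parsing.
    """
    parsed = datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.strftime("%Y%m%d")


def day_keys_for_month(year, month):
    """List the YYYYMMDD keys for every day of the given month."""
    days_in_month = calendar.monthrange(year, month)[1]
    return [
        "{:04d}{:02d}{:02d}".format(year, month, day)
        for day in range(1, days_in_month + 1)
    ]
```

Each key from `day_keys_for_month` could then be matched as an exact term against the string field, one per day, in place of a range over full timestamps.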

Regards
Aditya
www.findbestopensource.com

On Thu, Feb 23, 2012 at 11:55 AM, Jason Toy <jason...@gmail.com> wrote:

> I have a solr instance with about 400m docs. For text searches it is
> perfectly fine. When I do searches that calculate the amount of times a
> word appeared in the doc set for every day of a month, it usually causes
> solr to crash with out of memory errors.
> I calculate this by running ~30 queries, one for each day to see the
> count for that day.
> Is there a better way I could do this?
>
> Currently the date fields are stored as:
> <fieldType name="date" class="solr.TrieDateField" omitNorms="true"
> precisionStep="0" positionIncrementGap="0"/>
>
> and the timestamps are stored in the format of:
> 2012-02-22T21:11:14Z
>
> We have no need to store anything beyond the date. Will just changing the
> time portion to zeros make things faster:
> 2012-02-22T00:00:00Z
>
> I thought that to optimize this, there would be an actual date type that
> doesn't store the time component, but looking through the solr docs, I don't
> see anything specifically for a date as opposed to a timestamp. Would it
> be faster for me to store dates in an sint format? What is the optimal
> format I should use? If the format is to continue to use TrieDateField, is
> it not a waste to store the hour/minute/seconds even if they are not being
> used?
>
> Is there anything else I can do to make this more efficient?
>
> I have looked around on the mailing list and on google and not sure what
> to use, I would appreciate any pointers. Thanks.
>
> Jason
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org