Erick,

On Thu, May 9, 2013 at 10:28 PM, Erick Erickson <erickerick...@gmail.com> wrote:
> Yeah, this is getting warmer. Some of the
> CompressingStoredFieldsReader objects are 240M. The documents aren't
> nearly that large I don't think (but I'll verify).
Wow, 240M is a lot! Would you be able to tell which member of these instances is making them so large? Since you mentioned threading, I first thought the culprit was the buffer used for decompression, but it could be the fields index as well (it is stored in memory, but it should be shared across instances for the same segment, so threading shouldn't make the situation worse there).

> But still, over 700 of these objects live at once? I _think_ I'm
> seeing the number go up significantly when the number of indexing
> threads increases, but that's a bit of indirect evidence. My other
> question would be whether you'd expect the number of these objects to
> go up as the number of segments goes up, i.e. I assume they're
> per-segment....

Indeed, these objects are managed per segment, so increasing the number of segments increases the number of stored fields reader instances.

> So the pattern here is atomic updates on documents where some of the
> fields get quite large. So the underlying reader has to decompress
> these a lot. Do you have any suggestions how to mitigate this? Other
> than "don't do that<G>"....

As Savia suggests, lowering the number of indexing threads should help. If the merge factor is highish, lowering it could help as well. On my end, I'll fix the stored fields reader so that it no longer reuses the buffer for decompression (https://issues.apache.org/jira/browse/LUCENE-4995; even if we find out that this is not the problem here, it shouldn't hurt).

--
Adrien

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
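P.S. A minimal sketch of the buffer-reuse issue behind LUCENE-4995 (hypothetical code, not Lucene's actual implementation; the class and method names are made up): a reader that reuses its decompression buffer keeps holding memory sized to the largest document it has ever seen, while a reader that allocates a fresh buffer per document lets that memory be reclaimed right away.

```java
// Hypothetical sketch: why reusing a decompression buffer pins memory
// to the largest document seen. Sizes stand in for megabytes.
public class BufferReuseSketch {

    // Reader that reuses one buffer: it grows but never shrinks.
    private byte[] reused = new byte[0];

    // Returns how many bytes the reader still holds after decompressing
    // a document of the given size.
    int decompressReusing(int docSize) {
        if (reused.length < docSize) {
            reused = new byte[docSize]; // grow to fit, then keep forever
        }
        return reused.length;
    }

    // Reader that allocates per call: the buffer is garbage once we return.
    int decompressFresh(int docSize) {
        byte[] buf = new byte[docSize];
        return buf.length; // nothing retained after this call
    }

    public static void main(String[] args) {
        BufferReuseSketch r = new BufferReuseSketch();
        int[] docSizes = {1, 240, 1}; // one 240 "MB" document in the middle
        int heldReusing = 0, heldFresh = 0;
        for (int size : docSizes) {
            heldReusing = r.decompressReusing(size);
            heldFresh = r.decompressFresh(size);
        }
        // The reusing reader still holds the 240 "MB" buffer even though
        // the last document was tiny; the fresh reader retains nothing.
        System.out.println("reusing holds: " + heldReusing); // 240
        System.out.println("fresh holds: " + heldFresh);     // 1
    }
}
```

Multiply that retained buffer by the 700+ reader instances observed (one per segment, and more with more indexing threads) and the heap numbers add up quickly.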