It turns out that computing this facet only takes about 12MB, but the fielddata cache was completely full. Restarting the nodes emptied the cache, and everything started working again.
I note that there's a setting:

  indices.fielddata.cache.expire

Which is off by default. I guess I need to set that to something sensible and see what happens. What's the reason for it being off by default?

Thanks

Seb

On Friday, 25 July 2014 11:51:03 UTC+1, Seb Bacon wrote:
>
> OK, so it turns out the GET version just wasn't getting parsed at all.
>
> curl -XPOST -G http://localhost:9200/bork/user/_search -d '
> something-nonsense'
>
> Always returns everything; the parameters have to be in the form key=val
> when in the URL. The docs do already say that; I was being misled by the
> behaviour of the elasticsearch-head plugin, which I assumed was doing the
> right thing with JSON.
>
> Back to the drawing board... I'm back to my original assumption (before
> this red herring) that the issue is because the query is faceting across
> the entire dataset, which is simply too big.
>
> My assumption was that by including the type in the URL the faceting would
> only happen across that type (which only has 101 records), but I suppose
> this is not the case...?
>
> Thanks
>
> Seb
>
> On Friday, 25 July 2014 10:41:59 UTC+1, Seb Bacon wrote:
>>
>> Hi,
>>
>> I've got a search query which fails with "CircuitBreakingException: Data
>> too large" when POSTed, but succeeds when the identical query is sent as a
>> GET (with the json in the query string).
>>
>> The search query itself may be buggy, as far as I can tell (the "size"
>> parameter is in the wrong place). But the different behaviour between the
>> two test cases is the bug I'm interested in.
>>
>> This is on version 1.0.3, with two nodes. Presumably something is causing
>> too many fieldvalues to be loaded into memory in the POST version (note
>> "nested: QueryPhaseExecutionException" in the error output). I can't
>> reproduce locally with a small dataset, only in production with a 10G
>> index. I guess I would have to create a very large test dataset first (and
>> maybe set the circuit breaker settings low?), but I've run out of time for
>> debugging it this morning and thought I'd see if this was a known issue
>> first. I thought the "nested" message might be a meaningful clue to someone
>> who knows more about it.
>>
>> Gist here: https://gist.github.com/sebbacon/7b5e67aaae7f0e0a31aa
>>
>> Thanks
>>
>> Seb
>>
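
For anyone hitting the same GET-versus-POST red herring described in this thread: curl's -G flag appends the -d data to the URL after a "?" instead of sending it as a request body. The rough sketch below illustrates why a JSON blob placed there yields no usable parameters. It uses Python's urllib.parse.parse_qs as a stand-in for the server's key=val parsing, which is an assumption and not Elasticsearch's actual parser; the helper names url_with_get_data and parsed_params are made up for illustration.

```python
from urllib.parse import parse_qs


def url_with_get_data(base_url: str, data: str) -> str:
    """Mimic curl -G: append the -d data to the URL after a '?'."""
    return f"{base_url}?{data}"


def parsed_params(url: str) -> dict:
    """Return the key=val pairs found in the URL's query string."""
    query = url.split("?", 1)[1] if "?" in url else ""
    # parse_qs silently drops anything that is not in key=val form.
    return parse_qs(query)


BASE = "http://localhost:9200/bork/user/_search"

# A JSON body and plain nonsense both contain no key=val pairs.
json_url = url_with_get_data(BASE, '{"query":{"match_all":{}},"size":1}')
nonsense_url = url_with_get_data(BASE, "something-nonsense")

# Hypothetical key=val parameters, the form the URL is expected to carry.
keyval_url = url_with_get_data(BASE, "q=name:seb&size=1")

print(parsed_params(json_url))
print(parsed_params(nonsense_url))
print(parsed_params(keyval_url))
```

With no recognisable parameters and no request body, the search falls back to returning everything, which would explain why the "GET" variant appeared to succeed while the POST with a real JSON body tripped the circuit breaker.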
