Interesting. As I mentioned previously, all objects in the “collector” buckets share the same structure. Riak is trying to index a field that is actually inside a String: “data_followers” is a key inside the “data” map, and the value for that key is escaped JSON (FYI, it later goes into offline processing). And as you can see, there isn’t any indexing set up for that field.
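For what it’s worth, until the extractor is changed, a minimal pre-write guard along these lines should keep that oversized value out of anything the indexer sees (SearchTermGuard, MaxTermBytes, and sanitize are illustrative names of my own, not part of the Riak client API):

object SearchTermGuard {
  // merge_index rejects {Index, Field, Term} keys past ~32k bytes,
  // so keep any single field value safely under that
  val MaxTermBytes = 32 * 1024

  def sanitize(data: Map[String, Option[String]]): Map[String, Option[String]] =
    data.map {
      case (k, Some(v)) if v.getBytes("UTF-8").length >= MaxTermBytes =>
        k -> None // or divert the blob to a non-indexed bucket/key
      case kv => kv
    }
}

Running each InstagramCache’s data map through sanitize before the store call would at least stop new corrupt entries while the extractor change is worked out.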
On Nov 24, 2013, at 2:56 PM, Joe Caswell <[email protected]> wrote:

> Justin,
>
> The binary in the log entry below equates to:
> {<<"collector-collect-twitter">>,<<"data_followers">>,<<32897-byte string>>}
>
> Hope this helps.
>
> Joe
> From: Justin Long <[email protected]>
> Date: Sunday, November 24, 2013 5:17 PM
> To: Joe Caswell <[email protected]>
> Cc: Richard Shaw <[email protected]>, riak-users <[email protected]>
> Subject: Re: Runaway "Failed to compact" errors
>
> Thanks Joe. I would agree that would probably be the problem. I am concerned,
> since none of the fields of the objects I am storing in Riak would produce a
> key larger than 32 KB. Here’s a sample Scala (Java-based) POJO that
> represents an object in the problem bucket, using the Riak-Java-Client:
>
> case class InstagramCache(
>   @(JsonProperty@field)("identityId")
>   @(RiakKey@field)
>   val identityId: String, // ID of user on social network
>
>   @(JsonProperty@field)("userId")
>   @(RiakIndex@field)(name = "userId")
>   val userId: String, // associated user ID on platform
>
>   @(JsonProperty@field)("data")
>   val data: Map[String, Option[String]],
>
>   @(JsonProperty@field)("updated")
>   var updated: Date
> )
>
> The fields identityId and userId would rarely exceed 30 characters. Is Riak
> trying to index the whole object?
>
> Thanks
>
> On Nov 24, 2013, at 2:11 PM, Joe Caswell <[email protected]> wrote:
>
>> Justin,
>>
>> The terms being stored in merge index are too large. The maximum size for
>> an {Index, Field, Term} key is 32k bytes.
>> The binary blob in your log entry represents a tuple that was 32952 bytes.
>> Since merge index uses a 15-bit integer to store term size, if the
>> term_to_binary of the given key is larger than 32767, the high bits are
>> lost, effectively storing (<large size> mod 32768) bytes.
>> When this data is read back, binary_to_term is unable to reconstruct the
>> key due to the missing bytes, and throws a badarg exception.
>>
>> Search index repair is documented here:
>> http://docs.basho.com/riak/1.4.0/cookbooks/Repairing-Search-Indexes/
>> However, you would need to first modify your extractor to not produce
>> search keys larger than 32k, or the corruption issues will recur.
>>
>> Joe Caswell
>>
>> From: Richard Shaw <[email protected]>
>> Date: Sunday, November 24, 2013 4:25 PM
>> To: Justin Long <[email protected]>
>> Cc: riak-users <[email protected]>
>> Subject: Re: Runaway "Failed to compact" errors
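To put a number on the truncation Joe describes above: a 15-bit length field keeps only the low 15 bits of the size, so (assuming plain wraparound) the 32952-byte key gets recorded as 32952 mod 32768 = 184 bytes, and binary_to_term later fails with badarg because most of the key is gone. A quick sanity check:

val keySize    = 32952            // term_to_binary size of the failing key
val storedSize = keySize & 0x7FFF // low 15 bits only: 184 bytes survive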
