Update: after checking the Java client and my bucket code, I noticed that I am doing the following:
val bucket = DB.client.createBucket(bucketName).enableForSearch().execute()

I have a feeling that “enableForSearch” is causing each object to be indexed in its entirety, instead of just the explicitly indexed fields. Checking this Javadoc http://basho.github.io/riak-java-client/1.0.5/com/basho/riak/client/bucket/WriteBucket.html shows that search=true is written for the bucket. Would that cause the entire object to be indexed, not just the explicit fields? (A rough client-side size check is sketched after the quoted thread below.)

On Nov 24, 2013, at 3:02 PM, Justin Long <[email protected]> wrote:

> Interesting, as I mentioned previously, all objects in the “collector” buckets
> share the same structure. It is trying to index a field that is actually
> inside a String. “data_followers” is a key inside the “data” map and the
> value for that key is escaped JSON (FYI, it goes into offline processing
> later). And as you can see there isn’t any indexing set for that field.
>
>
> On Nov 24, 2013, at 2:56 PM, Joe Caswell <[email protected]> wrote:
>
>> Justin,
>>
>> The binary in the log entry below equates to:
>> {<<"collector-collect-twitter">>,<<"data_followers">>,<<32897-byte string>>}
>>
>> Hope this helps.
>>
>> Joe
>>
>> From: Justin Long <[email protected]>
>> Date: Sunday, November 24, 2013 5:17 PM
>> To: Joe Caswell <[email protected]>
>> Cc: Richard Shaw <[email protected]>, riak-users <[email protected]>
>> Subject: Re: Runaway "Failed to compact" errors
>>
>> Thanks Joe. I would agree that is probably the problem. I am concerned,
>> since none of the fields of the objects I am storing in Riak would produce a
>> key larger than 32kb. Here’s a sample Scala (Java-based) POJO that represents
>> an object in the problem bucket using the Riak-Java-Client:
>>
>> case class InstagramCache(
>>     @(JsonProperty@field)("identityId")
>>     @(RiakKey@field)
>>     val identityId: String, // ID of user on social network
>>
>>     @(JsonProperty@field)("userId")
>>     @(RiakIndex@field)(name = "userId")
>>     val userId: String, // associated user ID on platform
>>
>>     @(JsonProperty@field)("data")
>>     val data: Map[String, Option[String]],
>>
>>     @(JsonProperty@field)("updated")
>>     var updated: Date
>> )
>>
>> The fields identityId and userId would rarely exceed 30 characters. Is Riak
>> trying to index the whole object?
>>
>> Thanks
>>
>>
>> On Nov 24, 2013, at 2:11 PM, Joe Caswell <[email protected]> wrote:
>>
>>> Justin,
>>>
>>> The terms being stored in merge index are too large. The maximum size for
>>> an {Index, Field, Term} key is 32k bytes.
>>> The binary blob in your log entry represents a tuple that was 32952 bytes.
>>> Since merge index uses a 15-bit integer to store the term size, if the
>>> term_to_binary of the given key is larger than 32767 bytes, the high bits
>>> are lost, effectively storing (<large size> mod 32768) bytes.
>>> When this data is read back, binary_to_term is unable to reconstruct the
>>> key due to the missing bytes, and throws a badarg exception.
>>>
>>> Search index repair is documented here:
>>> http://docs.basho.com/riak/1.4.0/cookbooks/Repairing-Search-Indexes/
>>> However, you would first need to modify your extractor to not produce
>>> search keys larger than 32k, or the corruption issues will recur.
>>>
>>> Joe Caswell
>>>
>>>
>>> From: Richard Shaw <[email protected]>
>>> Date: Sunday, November 24, 2013 4:25 PM
>>> To: Justin Long <[email protected]>
>>> Cc: riak-users <[email protected]>
>>> Subject: Re: Runaway "Failed to compact" errors
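For reference, below is a minimal client-side sketch of the kind of guard Joe's explanation suggests: before writing an object to a search-enabled bucket, flag any field whose value would push the {Index, Field, Term} key past the ~32 KB merge_index limit. The names (SearchTermGuard, oversizedFields) are made up for illustration and are not part of the riak-java-client; the 32767-byte threshold and the plain byte-length approximation are assumptions taken from the thread above (the real key also carries term_to_binary encoding overhead).

object SearchTermGuard {
  // merge_index stores the key size in a 15-bit integer, so keys must stay
  // at or below 32767 bytes (see Joe's explanation in the quoted thread).
  val MaxKeyBytes = 32767

  private def utf8Len(s: String): Int = s.getBytes("UTF-8").length

  // Returns the fields whose combined bucket + field + value byte length
  // would exceed the limit, along with the offending value's size in bytes.
  def oversizedFields(bucket: String,
                      fields: Map[String, Option[String]]): Map[String, Int] =
    fields.collect {
      case (field, Some(value))
          if utf8Len(bucket) + utf8Len(field) + utf8Len(value) > MaxKeyBytes =>
        field -> utf8Len(value)
    }
}

Run against the sample object above (where cache is an InstagramCache instance), something like SearchTermGuard.oversizedFields("collector-collect-twitter", cache.data) would flag the 32897-byte data_followers value, which could then be dropped or truncated before the write, or the bucket could be created without enableForSearch() and rely only on the @RiakIndex secondary index on userId.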
