FYI - there is nothing in the string itself that would cause an error. It's just a concatenated list of integer values, with colon separators between the numbers. I've checked it visually - and, yep, the string is indeed a simple ASCII list of integers with colon delimiters. Nothing weird in it. So the problem is something in regard to the length - not the string contents.
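One way to test the length hypothesis is to compare the encoded size of the value against a candidate limit, and to split the string when it is too large. The sketch below is plain Java with no HBase dependency. The 10 MB limit is an assumption: it is taken to be the default of the client-side property `hbase.client.keyvalue.maxsize`, which nothing in this thread confirms, so it should be checked against the hbase-default.xml of the installed version. The class and method names (`CellSizeCheck`, `exceedsLimit`, `splitIntoChunks`) are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class CellSizeCheck {
    // ASSUMPTION: 10 MB client-side limit; not confirmed anywhere in this thread.
    public static final int ASSUMED_MAX_BYTES = 10 * 1024 * 1024;

    /** Returns true when the UTF-8 encoding of value is larger than maxBytes. */
    public static boolean exceedsLimit(String value, int maxBytes) {
        return value.getBytes(StandardCharsets.UTF_8).length > maxBytes;
    }

    /** Splits value into consecutive pieces of at most maxChars characters. */
    public static List<String> splitIntoChunks(String value, int maxChars) {
        if (maxChars <= 0) {
            throw new IllegalArgumentException("maxChars must be positive");
        }
        List<String> chunks = new ArrayList<>();
        int start = 0;
        while (start < value.length()) {
            // Use long arithmetic so start + maxChars cannot overflow.
            int end = (int) Math.min((long) value.length(), (long) start + maxChars);
            chunks.add(value.substring(start, end));
            start = end;
        }
        return chunks;
    }

    public static void main(String[] args) {
        // Build a colon-delimited integer list of the size reported in the thread.
        StringBuilder buf = new StringBuilder();
        while (buf.length() < 12_897_748) {
            buf.append("12345:");
        }
        String counts = buf.substring(0, 12_897_748);

        System.out.println("exceeds assumed limit: " + exceedsLimit(counts, ASSUMED_MAX_BYTES));
        System.out.println("number of 5 Meg chunks: " + splitIntoChunks(counts, 5_000_000).size());
    }
}
```

A chunk boundary can fall in the middle of a number, so the chunks only make sense when concatenated back in order. Each chunk could then be written under its own column qualifier (for example a numeric suffix on the qualifier name, which is an illustrative choice and not something used in this thread).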
Ron

-----Original Message-----
From: Taylor, Ronald C
Sent: Saturday, October 02, 2010 11:25 PM
To: 'Ryan Rawson'; [email protected]
Cc: Taylor, Ronald C
Subject: RE: How do you increase the max cell size in Hbase?

Ryan,

I just tried chopping the string to a max of 5 Meg, down from about 12 Meg at its largest, and the insertions appear to work fine. I do a scan afterwards and the fields and their contents appear to be all there. So - it would appear that I'm violating *some* length limit, somewhere, when I use the full 12 Meg string.

Here's the relevant code for the version that worked, with the replacement of the full string with the 5 Meg string instead:

    rowID = CURRENT_SPECIES_ID_ABBREV + "_" + RNASEQ_RUNID + "_chromo_pos_strand";
    p = new Put(Bytes.toBytes(rowID));

    chromo_pos_strand_counts = buf.toString();
    System.out.println("size of chromo_pos_strand_counts = " + chromo_pos_strand_counts.length());
    System.out.println("writing chromo positive strand data to table ...\n\n");

    temp = chromo_pos_strand_counts.substring(0,5000000);

    // p.add(Bytes.toBytes(colFamily),
    //       Bytes.toBytes("Chromo_Positive_Strand_rnaSeq_Counts"),Bytes.toBytes(chromo_pos_strand_counts));
    p.add(Bytes.toBytes(colFamily),Bytes.toBytes("Chromo_Positive_Strand_rnaSeq_Counts"),Bytes.toBytes(temp));

    p.add(Bytes.toBytes(colFamily),Bytes.toBytes("Strand"),Bytes.toBytes("positive"));
    p.add(Bytes.toBytes(colFamily),Bytes.toBytes("Source"),Bytes.toBytes("chromosome"));
    p.add(Bytes.toBytes(colFamily),Bytes.toBytes("Species"),Bytes.toBytes(CURRENT_SPECIES_ID));
    p.add(Bytes.toBytes(colFamily),Bytes.toBytes("SpeciesAbbrev"),Bytes.toBytes(CURRENT_SPECIES_ID_ABBREV));
    p.add(Bytes.toBytes(colFamily),Bytes.toBytes("RunID"),Bytes.toBytes(RNASEQ_RUNID));

    rnaSeqCountTable.put(p);

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

Going back to using the full string, I get this error msg:

size of chromo_pos_strand_counts = 12897748
writing chromo positive strand data to table ...
Exception java.lang.IllegalArgumentException: KeyValue size too large in write_rnaSeqCountData_Into_rnaSeqCountTable()
e = 'java.lang.IllegalArgumentException: KeyValue size too large'

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

Any thoughts?

Ron
___________________________________________
Ronald Taylor, Ph.D.
Computational Biology & Bioinformatics Group
Pacific Northwest National Laboratory
902 Battelle Boulevard
P.O. Box 999, Mail Stop J4-33
Richland, WA 99352 USA
Office: 509-372-6568
Email: [email protected]

-----Original Message-----
From: Ryan Rawson [mailto:[email protected]]
Sent: Saturday, October 02, 2010 11:07 PM
To: Taylor, Ronald C
Cc: [email protected]
Subject: Re: How do you increase the max cell size in Hbase?

Hey,

The max sizes of these things are determined by how many bits we use to specify lengths in the file format. Changing it isn't an option because the file format depends on it and done by code.

However you shouldn't be hitting limits based on the data you told us... perhaps if you could paste the exception backtrace it might indicate something useful.

-ryan

On Sat, Oct 2, 2010 at 10:56 PM, Taylor, Ronald C <[email protected]> wrote:
>
> Hi Ryan,
>
> Well, the key is only 50 chars or so, and the data in the cell is
> about 12 Meg, in one string. So - obviously this is way less than the
> 2 Gb limit for each. Does this have anything to do with the row
> length, which is supposed to be kept to less than, from what you say
> below,
> Short.MAX_LENGTH ?
>
> I do not see MAX_LENGTH being set anywhere in the conf files (just did a
> grep), so - what is the default row max length, and where would I reset it,
> if needed?
>
> And if that's *not* the problem, any other possibilities for an error
> msg of
> KeyValue size too large
>
> being given on a rather prosaic Put insertion?
>
> Ron
>
> -----Original Message-----
> From: Ryan Rawson [mailto:[email protected]]
> Sent: Saturday, October 02, 2010 10:42 PM
> To: [email protected]
> Cc: Taylor, Ronald C
> Subject: Re: How do you increase the max cell size in Hbase?
>
> Hey,
>
> The limits are due to code/data limits, eg: how many bits of space we use to
> indicate lengths and such. This is something like ~ 2gb for the "key" part
> and the "value" part each. Furthermore the row can only be Short.MAX_LENGTH.
>
> There is specific exceptions for each one of these in trunk/0.89, do you have
> a specific exception text?
>
> -ryan
>
> On Sat, Oct 2, 2010 at 10:31 PM, Taylor, Ronald C <[email protected]>
> wrote:
>>
>> Hello,
>>
>> I would like to increase the max cell size in one of my Hbase tables.
>> Just got an error msg when trying to insert something about 12 Meg in
>> size that said
>>
>> KeyValue size too large
>>
>> I presume that I'm using the default cell max size at present - I cannot
>> find anything regarding cell size in the conf files, so the default setting
>> must be being used.
>>
>> How do I increase the max size allowed? For example, if I want to allow a
>> string of up to 20 Meg in size, which conf file do I change, and what is the
>> precise wording?
>>
>> I googled around and found a note on MAX_LENGTH, but it is unclear
>> how to set it and where. Do I do something like
>>
>> MAX_LENGTH=20
>>
>> in the conf//hbase-env.sh file?
>>
>> And, if that works, do I need to restart Hbase and then recreate all the
>> tables in which I want to use the new max cell size?
>>
>> Cheers,
>> Ron
>>
>> ___________________________________________
>> Ronald Taylor, Ph.D.
>> Computational Biology & Bioinformatics Group Pacific Northwest
>> National Laboratory
>> 902 Battelle Boulevard
>> P.O. Box 999, Mail Stop J4-33
>> Richland, WA 99352 USA
>> Office: 509-372-6568
>> Email: [email protected]
>>
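To put the limits described in the quoted replies into numbers: a length held in a signed 32-bit field tops out just under 2 GB, and one held in a signed 16-bit field at 32767. The following is a minimal sketch in plain Java, assuming that "Short.MAX_LENGTH" in the messages refers to the Java constant `Short.MAX_VALUE` (Java itself has no `Short.MAX_LENGTH`). The class name `LengthLimits` and its methods are hypothetical, and the limits are taken from the description in this thread, not checked against the HBase source.

```java
public class LengthLimits {
    // ASSUMPTION: key and value lengths are bounded by a signed 32-bit int,
    // as described in the thread ("~ 2gb" each).
    public static final int MAX_KEY_OR_VALUE_BYTES = Integer.MAX_VALUE; // 2^31 - 1

    // ASSUMPTION: the row length is bounded by a signed 16-bit short.
    public static final int MAX_ROW_BYTES = Short.MAX_VALUE; // 2^15 - 1

    /** True when a row key of this many bytes fits in the 16-bit length field. */
    public static boolean rowFits(long rowLengthBytes) {
        return rowLengthBytes >= 0 && rowLengthBytes <= MAX_ROW_BYTES;
    }

    /** True when a value of this many bytes fits in the 32-bit length field. */
    public static boolean valueFits(long valueLengthBytes) {
        return valueLengthBytes >= 0 && valueLengthBytes <= MAX_KEY_OR_VALUE_BYTES;
    }

    public static void main(String[] args) {
        // Figures reported in the thread: a row key of about 50 characters
        // and a value of 12897748 characters (ASCII, so one byte each).
        System.out.println("row fits:   " + rowFits(50));
        System.out.println("value fits: " + valueFits(12_897_748L));
    }
}
```

Both reported sizes are far inside these bounds, which is consistent with the remark that the format limits should not be the ones being hit here.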
