Jonathan,

Thank you for the info. I made the necessary mod in hbase-default.xml. I increased the allowed keyvalue max from 10 Meg (10,485,760, to be precise - wonder where that comes from) to 50 Meg. I then restarted Hbase. Now my code runs fine - the 12 Meg string goes into the cell with no error reported.

BTW - I do intend to keep my cell contents to a reasonable size, but on rare occasions I might need a loose limit, so I set it at 50 Meg so I don't have to worry. (Also aware that I can set it at zero or less to disable the limit check entirely, reading from the instructions.)

Thanks to both you and Ryan for the assistance.

Ron

___________________________________________
Ronald Taylor, Ph.D.
Computational Biology & Bioinformatics Group
Pacific Northwest National Laboratory
902 Battelle Boulevard
P.O. Box 999, Mail Stop J4-33
Richland, WA 99352 USA
Office: 509-372-6568
Email: [email protected]

-----Original Message-----
From: Jonathan Gray [mailto:[email protected]]
Sent: Sunday, October 03, 2010 9:24 PM
To: [email protected]
Subject: RE: How do you increase the max cell size in Hbase? - more info on the error seen

Sorry, forgot to respond to this earlier.

The configuration parameter you need to change is 'hbase.client.keyvalue.maxsize'

Setting it to 0 will remove the limit.

Let me know if this does not help.

> -----Original Message-----
> From: Taylor, Ronald C [mailto:[email protected]]
> Sent: Sunday, October 03, 2010 8:57 PM
> To: 'Ryan Rawson'; [email protected]; [email protected]
> Cc: Taylor, Ronald C; 'Ronald Taylor'
> Subject: RE: How do you increase the max cell size in Hbase? - more
> info on the error seen
>
> Hi Ryan,
>
> I put a
>
>   e.printStackTrace()
>
> in the catch clause of the method where the error occurs.
> Here is what I get reported back:
>
> e = 'java.lang.IllegalArgumentException: KeyValue size too large'
>
> java.lang.IllegalArgumentException: KeyValue size too large
>     at org.apache.hadoop.hbase.client.HTable.validatePut(HTable.java:688)
>     at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:544)
>     at org.apache.hadoop.hbase.client.HTable.put(HTable.java:535)
>     at putPeptideDataIntoHBase.write_rnaSeqCountData_Into_rnaSeqCountTable(putPeptideDataIntoHBase.java:6235)
>     at putPeptideDataIntoHBase.inputRNAseqCountData(putPeptideDataIntoHBase.java:5773)
>     at putPeptideDataIntoHBase.invoke(putPeptideDataIntoHBase.java:1000)
>     at putPeptideDataIntoHBase.main(putPeptideDataIntoHBase.java:765)
>
> So - it looks like the error is occurring in the validatePut method.
>
> I am using Hbase ver 0.89.20100726 on a 25-node cluster running Hadoop
> 0.20.2.
>
> OS info is as follows:
>
> [rtay...@h01 hbase]$ uname -a
> Linux h01.emsl.pnl.gov 2.6.18-194.11.1.el5 #1 SMP Tue Jul 27 05:45:06 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux
> [rtay...@h01 hbase]$
> [rtay...@h01 hbase]$ more /etc/redhat-release
> Red Hat Enterprise Linux Client release 5.5 (Tikanga)
> [rtay...@h01 hbase]$
>
> Also: I just tried cutting the string down to 8.0 Meg instead of down
> to 5.0 Meg. That also works. So - the problem starts to occur
> somewhere between a string size of 8 and ~12.5 Meg.
>
> Ron
>
> ___________________________________________
> Ronald Taylor, Ph.D.
> Computational Biology & Bioinformatics Group
> Pacific Northwest National Laboratory
> 902 Battelle Boulevard
> P.O. Box 999, Mail Stop J4-33
> Richland, WA 99352 USA
> Office: 509-372-6568
> Email: [email protected]
>
> -----Original Message-----
> From: Ryan Rawson [mailto:[email protected]]
> Sent: Sunday, October 03, 2010 12:23 AM
> To: Taylor, Ronald C
> Cc: [email protected]; [email protected]
> Subject: Re: How do you increase the max cell size in Hbase?
>
> What version of HBase are you using?
> Looking in the HBase source code, in the likely place
> (KeyValue.createByteArray), the exceptions look more like yay so:
>
>   throw new IllegalArgumentException("Row > " + Short.MAX_VALUE);
>
> Perhaps you could identify the call stack and log that? That would
> help a lot, also the message suggests it might not come from HBase
> code...
>
> -ryan
>
> On Sat, Oct 2, 2010 at 11:35 PM, Taylor, Ronald C <[email protected]> wrote:
> >
> > FYI - there is nothing in the string itself that would cause an
> > error. It's just a concatenated list of integer values, with colon
> > separators between the numbers. I've checked it visually - and, yep,
> > the string is indeed a simple ASCII list of integers with colon
> > delimiters. Nothing weird in it. So the problem is something in regard
> > to the length - not the string contents.
> >
> > Ron
> >
> > -----Original Message-----
> > From: Taylor, Ronald C
> > Sent: Saturday, October 02, 2010 11:25 PM
> > To: 'Ryan Rawson'; [email protected]
> > Cc: Taylor, Ronald C
> > Subject: RE: How do you increase the max cell size in Hbase?
> >
> > Ryan,
> >
> > I just tried chopping the string to a max of 5 Meg, down from about
> > 12 Meg at its largest, and the insertions appear to work fine. I do a
> > scan afterwards and the fields and their contents appear to be all
> > there. So - it would appear that I'm violating *some* length limit,
> > somewhere, when I use the full 12 Meg string.
> >
> > Here's the relevant code for the version that worked, with the
> > replacement of the full string with the 5 Meg string instead:
> >
> >   rowID = CURRENT_SPECIES_ID_ABBREV + "_" + RNASEQ_RUNID + "_chromo_pos_strand";
> >   p = new Put(Bytes.toBytes(rowID));
> >
> >   chromo_pos_strand_counts = buf.toString();
> >
> >   System.out.println("size of chromo_pos_strand_counts = " + chromo_pos_strand_counts.length());
> >   System.out.println("writing chromo positive strand data to table ...\n\n");
> >
> >   temp = chromo_pos_strand_counts.substring(0,5000000);
> >   // p.add(Bytes.toBytes(colFamily),
> >   //       Bytes.toBytes("Chromo_Positive_Strand_rnaSeq_Counts"),Bytes.toBytes(chromo_pos_strand_counts));
> >
> >   p.add(Bytes.toBytes(colFamily),Bytes.toBytes("Chromo_Positive_Strand_rnaSeq_Counts"),Bytes.toBytes(temp));
> >   p.add(Bytes.toBytes(colFamily),Bytes.toBytes("Strand"),Bytes.toBytes("positive"));
> >   p.add(Bytes.toBytes(colFamily),Bytes.toBytes("Source"),Bytes.toBytes("chromosome"));
> >   p.add(Bytes.toBytes(colFamily),Bytes.toBytes("Species"),Bytes.toBytes(CURRENT_SPECIES_ID));
> >   p.add(Bytes.toBytes(colFamily),Bytes.toBytes("SpeciesAbbrev"),Bytes.toBytes(CURRENT_SPECIES_ID_ABBREV));
> >   p.add(Bytes.toBytes(colFamily),Bytes.toBytes("RunID"),Bytes.toBytes(RNASEQ_RUNID));
> >   rnaSeqCountTable.put(p);
> >
> > %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
> >
> > Going back to using the full string, I get this error msg:
> >
> > size of chromo_pos_strand_counts = 12897748
> > writing chromo positive strand data to table ...
> >
> > Exception java.lang.IllegalArgumentException: KeyValue size too large in write_rnaSeqCountData_Into_rnaSeqCountTable()
> >
> > e = 'java.lang.IllegalArgumentException: KeyValue size too large'
> >
> > %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
> >
> > Any thoughts?
> >
> > Ron
> >
> > ___________________________________________
> > Ronald Taylor, Ph.D.
> > Computational Biology & Bioinformatics Group
> > Pacific Northwest National Laboratory
> > 902 Battelle Boulevard
> > P.O. Box 999, Mail Stop J4-33
> > Richland, WA 99352 USA
> > Office: 509-372-6568
> > Email: [email protected]
> >
> > -----Original Message-----
> > From: Ryan Rawson [mailto:[email protected]]
> > Sent: Saturday, October 02, 2010 11:07 PM
> > To: Taylor, Ronald C
> > Cc: [email protected]
> > Subject: Re: How do you increase the max cell size in Hbase?
> >
> > Hey,
> >
> > The max sizes of these things are determined by how many bits we use
> > to specify lengths in the file format. Changing it isn't an option
> > because the file format depends on it and done by code.
> >
> > However you shouldn't be hitting limits based on the data you told
> > us... perhaps if you could paste the exception backtrace it might
> > indicate something useful.
> >
> > -ryan
> >
> > On Sat, Oct 2, 2010 at 10:56 PM, Taylor, Ronald C <[email protected]> wrote:
> >>
> >> Hi Ryan,
> >>
> >> Well, the key is only 50 chars or so, and the data in the cell is
> >> about 12 Meg, in one string. So - obviously this is way less than the
> >> 2 Gb limit for each. Does this have anything to do with the row
> >> length, which is supposed to be kept to less than, from what you
> >> say below, Short.MAX_LENGTH ?
> >>
> >> I do not see MAX_LENGTH being set anywhere in the conf files (just
> >> did a grep), so - what is the default row max length, and where would
> >> I reset it, if needed?
> >>
> >> And if that's *not* the problem, any other possibilities for an error
> >> msg of
> >>
> >>   KeyValue size too large
> >>
> >> being given on a rather prosaic Put insertion?
> >>
> >> Ron
> >>
> >> -----Original Message-----
> >> From: Ryan Rawson [mailto:[email protected]]
> >> Sent: Saturday, October 02, 2010 10:42 PM
> >> To: [email protected]
> >> Cc: Taylor, Ronald C
> >> Subject: Re: How do you increase the max cell size in Hbase?
> >>
> >> Hey,
> >>
> >> The limits are due to code/data limits, eg: how many bits of space
> >> we use to indicate lengths and such. This is something like ~ 2gb for
> >> the "key" part and the "value" part each. Furthermore the row can
> >> only be Short.MAX_LENGTH.
> >>
> >> There are specific exceptions for each one of these in trunk/0.89,
> >> do you have a specific exception text?
> >>
> >> -ryan
> >>
> >> On Sat, Oct 2, 2010 at 10:31 PM, Taylor, Ronald C <[email protected]> wrote:
> >>>
> >>> Hello,
> >>>
> >>> I would like to increase the max cell size in one of my Hbase
> >>> tables. Just got an error msg when trying to insert something about
> >>> 12 Meg in size that said
> >>>
> >>>   KeyValue size too large
> >>>
> >>> I presume that I'm using the default cell max size at present - I
> >>> cannot find anything regarding cell size in the conf files, so the
> >>> default setting must be being used.
> >>>
> >>> How do I increase the max size allowed? For example, if I want to
> >>> allow a string of up to 20 Meg in size, which conf file do I change,
> >>> and what is the precise wording?
> >>>
> >>> I googled around and found a note on MAX_LENGTH, but it is unclear
> >>> how to set it and where. Do I do something like
> >>>
> >>>   MAX_LENGTH=20
> >>>
> >>> in the conf/hbase-env.sh file?
> >>>
> >>> And, if that works, do I need to restart Hbase and then recreate
> >>> all the tables in which I want to use the new max cell size?
> >>>
> >>> Cheers,
> >>> Ron
> >>>
> >>> ___________________________________________
> >>> Ronald Taylor, Ph.D.
> >>> Computational Biology & Bioinformatics Group
> >>> Pacific Northwest National Laboratory
> >>> 902 Battelle Boulevard
> >>> P.O. Box 999, Mail Stop J4-33
> >>> Richland, WA 99352 USA
> >>> Office: 509-372-6568
> >>> Email: [email protected]
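
A note on the numbers in this thread: the 10,485,760 figure is 10 x 1024 x 1024 bytes, which would explain why an 8 Meg string was accepted and the 12,897,748-character ASCII string was rejected. The sketch below is a simplified model of that kind of size check, not the HBase source. The class name, method name, and constants are hypothetical, the "8 Meg" and "50 Meg" byte counts are assumptions, and only the value length is compared (the real client check would also count the key bytes).

```java
// Simplified, hypothetical model of a client-side KeyValue size check.
// This is NOT the HBase implementation; names here are illustrative only.
class KeyValueLimitSketch {

    // 10 * 1024 * 1024 = 10,485,760 bytes, the limit value quoted in the thread
    // for hbase.client.keyvalue.maxsize.
    static final long TEN_MIB = 10L * 1024 * 1024;

    // The "50 Meg" setting from the thread, ASSUMING binary megabytes were used.
    static final long FIFTY_MIB = 50L * 1024 * 1024;

    // Returns true when a cell of the given size would be rejected.
    // A limit of zero or less disables the check, as described in the thread.
    static boolean exceedsLimit(long sizeInBytes, long maxSize) {
        return maxSize > 0 && sizeInBytes > maxSize;
    }

    public static void main(String[] args) {
        long fullString = 12_897_748L; // size printed in the thread; ASCII, so one byte per char
        long eightMeg = 8_000_000L;    // ASSUMED byte count for the "8.0 Meg" substring
        long fiveMeg = 5_000_000L;     // substring(0,5000000) from the thread

        System.out.println("limit (10 MiB) in bytes: " + TEN_MIB);
        System.out.println("5 Meg rejected at 10 MiB: " + exceedsLimit(fiveMeg, TEN_MIB));
        System.out.println("8 Meg rejected at 10 MiB: " + exceedsLimit(eightMeg, TEN_MIB));
        System.out.println("full string rejected at 10 MiB: " + exceedsLimit(fullString, TEN_MIB));
        System.out.println("full string rejected at 50 MiB: " + exceedsLimit(fullString, FIFTY_MIB));
        System.out.println("full string rejected with limit 0: " + exceedsLimit(fullString, 0L));
    }
}
```

Under these assumptions, only the full string at the 10 MiB limit is rejected, which is consistent with the failure appearing somewhere between 8 and ~12.5 Meg.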
