bq. could I set the hbase.client.keyvalue.maxsize to a larger value?

Yes, you can.
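
For reference, the property takes a value in bytes, so the 200 MB from the question has to be converted first. The following is a minimal sketch in plain Java (no HBase classes; the class and method names are made up for illustration) that does the conversion and shows how small that limit is relative to a 20 GB region:

```java
// Sketch only: converts the proposed limit to bytes and compares it with the
// region split size from the question. No HBase classes are involved.
class KeyValueMaxSizeCalc {
    static final String PROPERTY = "hbase.client.keyvalue.maxsize";
    static final long MB = 1024L * 1024L;
    static final long GB = 1024L * MB;

    /** Converts megabytes to the byte count the property expects. */
    static long megabytesToBytes(long megabytes) {
        return megabytes * MB;
    }

    /** Returns the limit as a fraction of the region size (0.01 means 1%). */
    static double fractionOfRegion(long limitBytes, long regionBytes) {
        return (double) limitBytes / (double) regionBytes;
    }

    public static void main(String[] args) {
        long currentLimit = 10485760L;              // value quoted in the question (10 MB)
        long proposedLimit = megabytesToBytes(200); // 209715200 bytes
        long regionSize = 20L * GB;                 // region size from the question

        System.out.println(PROPERTY + " current  = " + currentLimit);
        System.out.println(PROPERTY + " proposed = " + proposedLimit);
        System.out.println("fraction of region    = "
                + fractionOfRegion(proposedLimit, regionSize));
    }
}
```

The resulting number (209715200) would then go into the client-side configuration, for example in the client's hbase-site.xml. At under 1% of a 20 GB region it is still a small fraction of the maximum region size, which is what the parameter description recommends.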
bq. to check the size after compression

HTable doesn't perform compression. It performs validation on Puts. Where do you suggest the validation be performed?

Cheers

On Wed, May 28, 2014 at 7:26 PM, Henry Hung <[email protected]> wrote:

> Hi All,
>
> Today I stumbled upon this error:
>
> Error: java.io.IOException: java.io.IOException:
> java.lang.IllegalArgumentException: KeyValue size too large
>     at com.winbond.hadoop.fdc.mapreduce.xml.XmlToHBaseMapper.map(XmlToHBaseMapper.java:204)
>     at com.winbond.hadoop.fdc.mapreduce.xml.XmlToHBaseMapper.map(XmlToHBaseMapper.java:1)
>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:339)
>     at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:396)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>     at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
>
> After looking into the source code, it appears this is a constraint imposed
> by hbase.client.keyvalue.maxsize = 10485760.
>
> From the parameter description:
> Specifies the combined maximum allowed size of a KeyValue instance. This
> is to set an upper boundary for a single entry saved in a storage file.
> Since they cannot be split it helps avoiding that a region cannot be split
> any further because the data is too large. It seems wise to set this to a
> fraction of the maximum region size. Setting it to zero or less disables
> the check.
>
> If I set the table's region size to be maximum 20GB before splitting,
> could I set the hbase.client.keyvalue.maxsize to a larger value? Such as:
> 200 MB
>
> One more thing, when I looked into HTable.java source code, it appears
> that the key value size is checked before compression, is this true?
> I think it should be more reasonable to check the size after compression,
> no?
>
> Best regards,
> Henry Hung
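
To illustrate the point in the reply above, that the client validates the Put itself and never sees a compressed form, here is a simplified sketch in plain Java. It is not HBase's actual code: `PutSizeCheck`, `rawSize` and `validate` are hypothetical names, and the size is approximated as the sum of the row, family, qualifier and value lengths, whereas a real KeyValue also carries some fixed overhead (lengths, timestamp, type). The "zero or less disables the check" behaviour follows the parameter description quoted in the thread.

```java
import java.nio.charset.StandardCharsets;

// Simplified illustration of a client-side size check on uncompressed data.
// Hypothetical names; not the HTable implementation.
class PutSizeCheck {
    /** Approximates the cell size as the sum of its uncompressed parts. */
    static long rawSize(byte[] row, byte[] family, byte[] qualifier, byte[] value) {
        return (long) row.length + family.length + qualifier.length + value.length;
    }

    /** Rejects oversized cells; a limit of zero or less disables the check. */
    static void validate(long rawSize, long maxKeyValueSize) {
        if (maxKeyValueSize > 0 && rawSize > maxKeyValueSize) {
            throw new IllegalArgumentException("KeyValue size too large");
        }
    }

    public static void main(String[] args) {
        byte[] row = "row-1".getBytes(StandardCharsets.UTF_8);
        byte[] family = "cf".getBytes(StandardCharsets.UTF_8);
        byte[] qualifier = "xml".getBytes(StandardCharsets.UTF_8);
        // 11 MB of zeros: this would compress very well, but the check
        // only ever sees the raw length.
        byte[] value = new byte[11 * 1024 * 1024];

        long size = rawSize(row, family, qualifier, value);
        try {
            validate(size, 10485760L);
            System.out.println("accepted");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }

        validate(size, 0L); // check disabled, no exception
        System.out.println("accepted with the check disabled");
    }
}
```

Because compression is applied later, on the server side when store files are written, a limit enforced at the client can only be expressed in terms of the raw size.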
