[ https://issues.apache.org/jira/browse/BEAM-10464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17180687#comment-17180687 ]

Beam JIRA Bot commented on BEAM-10464:
--------------------------------------

This issue was marked "stale-assigned" and has not received a public comment in 
7 days. It is now automatically unassigned. If you are still working on it, you 
can assign it to yourself again. Please also give an update about the status of 
the work.

> [HBaseIO] - Protocol message was too large.  May be malicious.
> --------------------------------------------------------------
>
>                 Key: BEAM-10464
>                 URL: https://issues.apache.org/jira/browse/BEAM-10464
>             Project: Beam
>          Issue Type: Bug
>          Components: beam-community
>            Reporter: Marc Catrisse
>            Priority: P1
>
> Hi! I just got the following error performing an HBaseIO.read() from a scan.
> {code:java}
> Caused by: org.apache.hadoop.hbase.shaded.com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large.  May be malicious.  Use CodedInputStream.setSizeLimit() to increase the size limit.
>   at org.apache.hadoop.hbase.shaded.com.google.protobuf.InvalidProtocolBufferException.sizeLimitExceeded(InvalidProtocolBufferException.java:110)
>   at org.apache.hadoop.hbase.shaded.com.google.protobuf.CodedInputStream.refillBuffer(CodedInputStream.java:755)
>   at org.apache.hadoop.hbase.shaded.com.google.protobuf.CodedInputStream.isAtEnd(CodedInputStream.java:701)
>   at org.apache.hadoop.hbase.shaded.com.google.protobuf.CodedInputStream.readTag(CodedInputStream.java:99)
>   at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$Result.<init>(ClientProtos.java:4694)
>   at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$Result.<init>(ClientProtos.java:4658)
>   at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$Result$1.parsePartialFrom(ClientProtos.java:4767)
>   at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$Result$1.parsePartialFrom(ClientProtos.java:4762)
>   at org.apache.hadoop.hbase.shaded.com.google.protobuf.AbstractParser.parsePartialFrom(AbstractParser.java:200)
>   at org.apache.hadoop.hbase.shaded.com.google.protobuf.AbstractParser.parsePartialDelimitedFrom(AbstractParser.java:241)
>   at org.apache.hadoop.hbase.shaded.com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:253)
>   at org.apache.hadoop.hbase.shaded.com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:259)
>   at org.apache.hadoop.hbase.shaded.com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:49)
>   at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$Result.parseDelimitedFrom(ClientProtos.java:5131)
>   at org.apache.beam.sdk.io.hbase.HBaseResultCoder.decode(HBaseResultCoder.java:50)
>   at org.apache.beam.sdk.io.hbase.HBaseResultCoder.decode(HBaseResultCoder.java:34)
>   at org.apache.beam.sdk.coders.Coder.decode(Coder.java:159)
>   at org.apache.beam.sdk.util.WindowedValue$FullWindowedValueCoder.decode(WindowedValue.java:602)
>   at org.apache.beam.sdk.util.WindowedValue$FullWindowedValueCoder.decode(WindowedValue.java:593)
>   at org.apache.beam.sdk.util.WindowedValue$FullWindowedValueCoder.decode(WindowedValue.java:539)
>   at org.apache.beam.runners.spark.translation.ValueAndCoderLazySerializable.getOrDecode(ValueAndCoderLazySerializable.java:73)
>   ... 61 more
> {code}
> It seems I'm scanning a column family larger than 64 MB, but HBaseIO doesn't
> provide any way to raise the protobuf decoder's current size limit. How should
> we manage large datasets stored in HBase?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
