[
https://issues.apache.org/jira/browse/BEAM-10464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17180687#comment-17180687
]
Beam JIRA Bot commented on BEAM-10464:
--------------------------------------
This issue was marked "stale-assigned" and has not received a public comment in
7 days. It is now automatically unassigned. If you are still working on it, you
can assign it to yourself again. Please also give an update about the status of
the work.
> [HBaseIO] - Protocol message was too large. May be malicious.
> --------------------------------------------------------------
>
> Key: BEAM-10464
> URL: https://issues.apache.org/jira/browse/BEAM-10464
> Project: Beam
> Issue Type: Bug
> Components: beam-community
> Reporter: Marc Catrisse
> Priority: P1
>
> Hi! I just got the following error performing an HBaseIO.read() from a scan.
> {code:java}
> Caused by: org.apache.hadoop.hbase.shaded.com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large. May be malicious. Use CodedInputStream.setSizeLimit() to increase the size limit.
>     at org.apache.hadoop.hbase.shaded.com.google.protobuf.InvalidProtocolBufferException.sizeLimitExceeded(InvalidProtocolBufferException.java:110)
>     at org.apache.hadoop.hbase.shaded.com.google.protobuf.CodedInputStream.refillBuffer(CodedInputStream.java:755)
>     at org.apache.hadoop.hbase.shaded.com.google.protobuf.CodedInputStream.isAtEnd(CodedInputStream.java:701)
>     at org.apache.hadoop.hbase.shaded.com.google.protobuf.CodedInputStream.readTag(CodedInputStream.java:99)
>     at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$Result.<init>(ClientProtos.java:4694)
>     at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$Result.<init>(ClientProtos.java:4658)
>     at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$Result$1.parsePartialFrom(ClientProtos.java:4767)
>     at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$Result$1.parsePartialFrom(ClientProtos.java:4762)
>     at org.apache.hadoop.hbase.shaded.com.google.protobuf.AbstractParser.parsePartialFrom(AbstractParser.java:200)
>     at org.apache.hadoop.hbase.shaded.com.google.protobuf.AbstractParser.parsePartialDelimitedFrom(AbstractParser.java:241)
>     at org.apache.hadoop.hbase.shaded.com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:253)
>     at org.apache.hadoop.hbase.shaded.com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:259)
>     at org.apache.hadoop.hbase.shaded.com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:49)
>     at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$Result.parseDelimitedFrom(ClientProtos.java:5131)
>     at org.apache.beam.sdk.io.hbase.HBaseResultCoder.decode(HBaseResultCoder.java:50)
>     at org.apache.beam.sdk.io.hbase.HBaseResultCoder.decode(HBaseResultCoder.java:34)
>     at org.apache.beam.sdk.coders.Coder.decode(Coder.java:159)
>     at org.apache.beam.sdk.util.WindowedValue$FullWindowedValueCoder.decode(WindowedValue.java:602)
>     at org.apache.beam.sdk.util.WindowedValue$FullWindowedValueCoder.decode(WindowedValue.java:593)
>     at org.apache.beam.sdk.util.WindowedValue$FullWindowedValueCoder.decode(WindowedValue.java:539)
>     at org.apache.beam.runners.spark.translation.ValueAndCoderLazySerializable.getOrDecode(ValueAndCoderLazySerializable.java:73)
>     ... 61 more
> {code}
> It seems the column family I'm scanning contains rows bigger than 64 MB, but
> HBaseIO doesn't provide any way to raise the protobuf decoder's current
> sizeLimit. How should we manage large datasets stored in HBase?
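> Two possible workarounds, sketched below (untested against this setup, with
> assumptions flagged in the comments). The first bounds how much data goes
> into each Result before it ever reaches the coder, using the HBase client's
> Scan batching. Note that setBatch() splits wide rows into several partial
> Result objects, so downstream code that assumes one Result per row would
> need adjusting:
> {code:java}
> import org.apache.beam.sdk.io.hbase.HBaseIO;
> import org.apache.beam.sdk.values.PCollection;
> import org.apache.hadoop.hbase.client.Result;
> import org.apache.hadoop.hbase.client.Scan;
>
> Scan scan = new Scan();
> // Cap the number of cells per Result; wide rows come back as several
> // partial Results instead of one oversized protobuf message.
> scan.setBatch(1000);
> // Hint to the region server to chunk RPC responses (in bytes).
> scan.setMaxResultSize(32L * 1024 * 1024);
>
> PCollection<Result> rows =
>     pipeline.apply(
>         HBaseIO.read()
>             .withConfiguration(configuration) // assumed to be set up elsewhere
>             .withTableId("my-table")          // hypothetical table name
>             .withScan(scan));
> {code}
> The second: since the limit is enforced by the stream-backed CodedInputStream
> inside HBaseResultCoder.decode(), a custom coder that first reads the
> length-delimited message into a byte array (array-backed parsing never hits
> refillBuffer, where the limit check lives) could be attached to the
> PCollection via setCoder(). A minimal sketch, assuming the same wire format
> HBaseResultCoder uses (a varint length prefix followed by a serialized
> ClientProtos.Result); with the shaded HBase client the protobuf import would
> be the org.apache.hadoop.hbase.shaded.com.google.protobuf variant:
> {code:java}
> import java.io.DataInputStream;
> import java.io.IOException;
> import java.io.InputStream;
> import java.io.OutputStream;
>
> import com.google.protobuf.CodedInputStream;
> import org.apache.beam.sdk.coders.AtomicCoder;
> import org.apache.beam.sdk.coders.CoderException;
> import org.apache.hadoop.hbase.client.Result;
> import org.apache.hadoop.hbase.protobuf.ProtobufUtil;
> import org.apache.hadoop.hbase.protobuf.generated.ClientProtos;
>
> /** Hypothetical stand-in for HBaseResultCoder without the 64 MB stream limit. */
> public class LargeResultCoder extends AtomicCoder<Result> {
>
>   @Override
>   public void encode(Result value, OutputStream outStream) throws IOException {
>     // Same wire format as HBaseResultCoder: a length-delimited ClientProtos.Result.
>     ProtobufUtil.toResult(value).writeDelimitedTo(outStream);
>   }
>
>   @Override
>   public Result decode(InputStream inStream) throws IOException {
>     int firstByte = inStream.read();
>     if (firstByte < 0) {
>       throw new CoderException("unexpected EOF while reading a Result");
>     }
>     // Read the varint length prefix, then exactly that many message bytes,
>     // so nothing past the current element is consumed from a shared stream.
>     int size = CodedInputStream.readRawVarint32(firstByte, inStream);
>     byte[] bytes = new byte[size];
>     new DataInputStream(inStream).readFully(bytes);
>     // Parsing from a byte array is not subject to the stream size limit.
>     return ProtobufUtil.toResult(ClientProtos.Result.parseFrom(bytes));
>   }
> }
> {code}
> It could then be attached with rows.setCoder(new LargeResultCoder()), keeping
> in mind that a >64 MB Result still has to fit in worker memory.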
--
This message was sent by Atlassian Jira
(v8.3.4#803005)