Dug into one of the rows that was hitting a similar problem, throwing
IllegalArgumentException [1] instead of BufferUnderflowException; both
appear to be the same kind of data issue, where the varchar array is
stored in HBase in an unexpected format.
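
Both exceptions are consistent with the trailing array metadata carrying a
bad offset or length; for reference, a minimal sketch (pure JDK, nothing
Phoenix-specific) of the two java.nio failure modes:

import java.nio.BufferUnderflowException;
import java.nio.ByteBuffer;

public class BufferFailureModes {
    public static void main(String[] args) {
        // Pretend this is a 4-byte cell value.
        ByteBuffer buf = ByteBuffer.wrap(new byte[4]);

        // An element offset decoded past the end of the value makes
        // Buffer.position(int) throw IllegalArgumentException (the trace in [1])...
        try {
            buf.position(10);
        } catch (IllegalArgumentException e) {
            System.out.println("bad offset: " + e);
        }

        // ...while reading more bytes than remain in the buffer throws
        // BufferUnderflowException (the traces further down the thread).
        try {
            buf.get(new byte[8]);
        } catch (BufferUnderflowException e) {
            System.out.println("bad length: " + e);
        }
    }
}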

The row looks like:
*A_VARCHAR_OF_170_CHARS*\x00\x00\x00\x80\x01\x00\x00\x02\xAD\x00\x00\x00

I could not make sense of it based on the 4.13 encoding (hence Phoenix
throwing an exception), and looking back at 4.8 it doesn't seem to match
the old format either... Does anyone recognize the hex encoding by any
chance, or is this some sort of data corruption?
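
In case anyone wants to poke at the raw bytes themselves, here's a rough
sketch for dumping the stored cell and replaying the array decode outside
of a query. The table name, row key, column family, and qualifier below
are all placeholders to substitute:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.phoenix.schema.types.PVarcharArray;

public class DumpArrayCell {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        // "SCHEMA.TABLE" is a placeholder (it would be "SCHEMA:TABLE" if
        // namespace mapping is enabled).
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("SCHEMA.TABLE"))) {
            // Placeholder row key and column; substitute the actual encoded
            // row key, column family, and the qualifier of the array column
            // (with 4.13 column encoding it may not be the plain name).
            Get get = new Get(Bytes.toBytes("ROW_KEY"));
            Result result = table.get(get);
            byte[] cell = result.getValue(Bytes.toBytes("0"), Bytes.toBytes("D"));

            // Escaped dump of the raw stored bytes, like the one above.
            System.out.println(Bytes.toStringBinary(cell));

            // Replay the decode outside a query; this should throw the same
            // exception if the stored bytes really are malformed.
            System.out.println(PVarcharArray.INSTANCE.toObject(cell));
        }
    }
}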

Thanks,

- Will


[1]

java.lang.IllegalArgumentException
        at java.nio.Buffer.position(Buffer.java:244)
        at org.apache.phoenix.schema.types.PArrayDataType.createPhoenixArray(PArrayDataType.java:1025)
        at org.apache.phoenix.schema.types.PArrayDataType.toObject(PArrayDataType.java:375)
        at org.apache.phoenix.schema.types.PVarcharArray.toObject(PVarcharArray.java:65)
        at org.apache.phoenix.schema.types.PDataType.toObject(PDataType.java:1011)
        at org.apache.phoenix.compile.ExpressionProjector.getValue(ExpressionProjector.java:75)
        at org.apache.phoenix.jdbc.PhoenixResultSet.getString(PhoenixResultSet.java:609)
        at sqlline.Rows$Row.<init>(Rows.java:183)
        at sqlline.BufferedRows.<init>(BufferedRows.java:38)
        at sqlline.SqlLine.print(SqlLine.java:1660)
        at sqlline.Commands.execute(Commands.java:833)
        at sqlline.Commands.sql(Commands.java:732)
        at sqlline.SqlLine.dispatch(SqlLine.java:813)
        at sqlline.SqlLine.begin(SqlLine.java:686)
        at sqlline.SqlLine.start(SqlLine.java:398)
        at sqlline.SqlLine.main(SqlLine.java:291)


On Wed, Oct 17, 2018 at 3:21 PM William Shen <wills...@marinsoftware.com>
wrote:

> Thanks, Jaanai.
>
> At first we thought it was a data issue too, but when we restored the
> table from a snapshot into a separate schema on the same cluster to
> triage, the exception no longer occurred... Does that give any further
> clue as to what the issue might've been?
>
> 0: jdbc:phoenix:journalnode,test> SELECT A, B, C, D  FROM SCHEMA.TABLE
>  where A = 13100423;
>
> java.nio.BufferUnderflowException
> at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:151)
> at java.nio.ByteBuffer.get(ByteBuffer.java:715)
> at org.apache.phoenix.schema.types.PArrayDataType.createPhoenixArray(PArrayDataType.java:1028)
> at org.apache.phoenix.schema.types.PArrayDataType.toObject(PArrayDataType.java:375)
> at org.apache.phoenix.schema.types.PVarcharArray.toObject(PVarcharArray.java:65)
> at org.apache.phoenix.schema.types.PDataType.toObject(PDataType.java:1011)
> at org.apache.phoenix.compile.ExpressionProjector.getValue(ExpressionProjector.java:75)
> at org.apache.phoenix.jdbc.PhoenixResultSet.getString(PhoenixResultSet.java:609)
> at sqlline.Rows$Row.<init>(Rows.java:183)
> at sqlline.BufferedRows.<init>(BufferedRows.java:38)
> at sqlline.SqlLine.print(SqlLine.java:1660)
> at sqlline.Commands.execute(Commands.java:833)
> at sqlline.Commands.sql(Commands.java:732)
> at sqlline.SqlLine.dispatch(SqlLine.java:813)
> at sqlline.SqlLine.begin(SqlLine.java:686)
> at sqlline.SqlLine.start(SqlLine.java:398)
> at sqlline.SqlLine.main(SqlLine.java:291)
>
>
>
> 0: jdbc:phoenix:journalnode,test> SELECT A, B, C, D  FROM SCHEMA.CORRUPTION
> where A = 13100423;
>
> +-----------+--------+--------+-------------+
> |     A     |   B    |   C    |      D      |
> +-----------+--------+--------+-------------+
> | 13100423  | 5159   | 7      | ['female']  |
> +-----------+--------+--------+-------------+
> 1 row selected (1.76 seconds)
>
> On Sun, Oct 14, 2018 at 8:39 PM Jaanai Zhang <cloud.pos...@gmail.com>
> wrote:
>
>> It looks like a bug where the length being read is greater than what
>> remains in the ByteBuffer; there may be a problem with the ByteBuffer's
>> position or with the length of the target byte array.
>>
>> ----------------------------------------
>>    Jaanai Zhang
>>    Best regards!
>>
>>
>>
>> On Fri, Oct 12, 2018 at 11:53 PM, William Shen <wills...@marinsoftware.com> wrote:
>>
>>> Hi all,
>>>
>>> We are running Phoenix 4.13, and periodically we encounter the
>>> following exception when querying Phoenix in our staging environment.
>>> Initially, we thought an incompatible client version was connecting and
>>> corrupting data, but after ensuring that we only connect with 4.13
>>> clients, we still see this issue come up from time to time. So far,
>>> fortunately, since it is in staging, we have been able to identify and
>>> delete the affected data to restore service.
>>>
>>> However, we would like to ask for guidance on what else we could look
>>> for to identify the cause of this exception. Could this perhaps be
>>> caused by something other than data corruption?
>>>
>>> Thanks in advance!
>>>
>>> The exception looks like:
>>>
>>> 18/10/12 15:45:58 WARN scheduler.TaskSetManager: Lost task 32.2 in stage 14.0 (TID 1275, ...datanode..., executor 82):
>>> java.nio.BufferUnderflowException
>>> at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:151)
>>> at java.nio.ByteBuffer.get(ByteBuffer.java:715)
>>> at org.apache.phoenix.schema.types.PArrayDataType.createPhoenixArray(PArrayDataType.java:1028)
>>> at org.apache.phoenix.schema.types.PArrayDataType.toObject(PArrayDataType.java:375)
>>> at org.apache.phoenix.schema.types.PVarcharArray.toObject(PVarcharArray.java:65)
>>> at org.apache.phoenix.schema.types.PDataType.toObject(PDataType.java:1011)
>>> at org.apache.phoenix.compile.ExpressionProjector.getValue(ExpressionProjector.java:75)
>>> at org.apache.phoenix.jdbc.PhoenixResultSet.getObject(PhoenixResultSet.java:525)
>>> at org.apache.phoenix.spark.PhoenixRecordWritable$$anonfun$readFields$1.apply$mcVI$sp(PhoenixRecordWritable.scala:96)
>>> at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141)
>>> at org.apache.phoenix.spark.PhoenixRecordWritable.readFields(PhoenixRecordWritable.scala:93)
>>> at org.apache.phoenix.mapreduce.PhoenixRecordReader.nextKeyValue(PhoenixRecordReader.java:168)
>>> at org.apache.spark.rdd.NewHadoopRDD$$anon$1.hasNext(NewHadoopRDD.scala:174)
>>> at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39)
>>> at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
>>> at org.apache.spark.util.Utils$.getIteratorSize(Utils.scala:1596)
>>> at org.apache.spark.rdd.RDD$$anonfun$count$1.apply(RDD.scala:1157)
>>> at org.apache.spark.rdd.RDD$$anonfun$count$1.apply(RDD.scala:1157)
>>> at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1870)
>>> at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1870)
>>> at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
>>> at org.apache.spark.scheduler.Task.run(Task.scala:89)
>>> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:229)
>>> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>> at java.lang.Thread.run(Thread.java:748)