Thanks, Jaanai. At first we thought it was a data issue too, but when we restored the table from the snapshot to a separate schema on the same cluster to triage, the exception no longer happened... Does that give any further clue about what the issue might have been?
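(To make the failure mode concrete: a BufferUnderflowException out of HeapByteBuffer.get means the reader asked for more bytes than remain in the buffer, which matches the length/position mismatch you describe below. A minimal, Phoenix-agnostic sketch of that situation; this is not Phoenix's actual array serialization, just an illustration of how a truncated or corrupt length-prefixed value blows up:

    import java.nio.ByteBuffer;

    public class UnderflowDemo {
        public static void main(String[] args) {
            // Toy layout: a 4-byte length prefix followed by the payload it describes.
            // The prefix claims 8 payload bytes, but only 4 actually follow, as if the
            // stored cell had been truncated or corrupted.
            ByteBuffer buf = ByteBuffer.allocate(8);
            buf.putInt(8);                     // claimed payload length
            buf.put(new byte[] {1, 2, 3, 4});  // only 4 real payload bytes
            buf.flip();

            byte[] dst = new byte[buf.getInt()];  // dst.length == 8
            buf.get(dst);  // remaining() == 4 < dst.length -> java.nio.BufferUnderflowException
        }
    }

The sqlline session below shows the same query failing against the original table and succeeding against the copy restored from the snapshot.)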
0: jdbc:phoenix:journalnode,test> SELECT A, B, C, D FROM SCHEMA.TABLE where A = 13100423;
java.nio.BufferUnderflowException
    at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:151)
    at java.nio.ByteBuffer.get(ByteBuffer.java:715)
    at org.apache.phoenix.schema.types.PArrayDataType.createPhoenixArray(PArrayDataType.java:1028)
    at org.apache.phoenix.schema.types.PArrayDataType.toObject(PArrayDataType.java:375)
    at org.apache.phoenix.schema.types.PVarcharArray.toObject(PVarcharArray.java:65)
    at org.apache.phoenix.schema.types.PDataType.toObject(PDataType.java:1011)
    at org.apache.phoenix.compile.ExpressionProjector.getValue(ExpressionProjector.java:75)
    at org.apache.phoenix.jdbc.PhoenixResultSet.getString(PhoenixResultSet.java:609)
    at sqlline.Rows$Row.<init>(Rows.java:183)
    at sqlline.BufferedRows.<init>(BufferedRows.java:38)
    at sqlline.SqlLine.print(SqlLine.java:1660)
    at sqlline.Commands.execute(Commands.java:833)
    at sqlline.Commands.sql(Commands.java:732)
    at sqlline.SqlLine.dispatch(SqlLine.java:813)
    at sqlline.SqlLine.begin(SqlLine.java:686)
    at sqlline.SqlLine.start(SqlLine.java:398)
    at sqlline.SqlLine.main(SqlLine.java:291)

0: jdbc:phoenix:journalnode,test> SELECT A, B, C, D FROM SCHEMA.CORRUPTION where A = 13100423;
+-----------+--------+--------+-------------+
|     A     |   B    |   C    |      D      |
+-----------+--------+--------+-------------+
| 13100423  | 5159   | 7      | ['female']  |
+-----------+--------+--------+-------------+
1 row selected (1.76 seconds)

On Sun, Oct 14, 2018 at 8:39 PM Jaanai Zhang <cloud.pos...@gmail.com> wrote:

> It looks like a bug: the length being retrieved is greater than what remains
> in the ByteBuffer. Maybe the position of the ByteBuffer or the length of the
> target byte array has a problem.
>
> ----------------------------------------
>    Jaanai Zhang
>    Best regards!
>
>
>
> William Shen <wills...@marinsoftware.com> wrote on Fri, Oct 12, 2018 at 11:53 PM:
>
>> Hi all,
>>
>> We are running Phoenix 4.13, and periodically we encounter the following
>> exception when querying Phoenix in our staging environment. Initially, we
>> thought we had an incompatible client version connecting and creating data
>> corruption, but after ensuring that we only connect with 4.13 clients, we
>> still see this issue come up from time to time. So far, fortunately, since
>> it is in staging, we have been able to identify and delete the data to
>> restore service.
>>
>> However, we would like to ask for guidance on what else we could look for
>> to identify the cause of this exception. Could this perhaps be caused by
>> something other than data corruption?
>>
>> Thanks in advance!
>>
>> The exception looks like:
>>
>> 18/10/12 15:45:58 WARN scheduler.TaskSetManager: Lost task 32.2 in stage 14.0 (TID 1275, ...datanode..., executor 82):
>> java.nio.BufferUnderflowException
>>     at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:151)
>>     at java.nio.ByteBuffer.get(ByteBuffer.java:715)
>>     at org.apache.phoenix.schema.types.PArrayDataType.createPhoenixArray(PArrayDataType.java:1028)
>>     at org.apache.phoenix.schema.types.PArrayDataType.toObject(PArrayDataType.java:375)
>>     at org.apache.phoenix.schema.types.PVarcharArray.toObject(PVarcharArray.java:65)
>>     at org.apache.phoenix.schema.types.PDataType.toObject(PDataType.java:1011)
>>     at org.apache.phoenix.compile.ExpressionProjector.getValue(ExpressionProjector.java:75)
>>     at org.apache.phoenix.jdbc.PhoenixResultSet.getObject(PhoenixResultSet.java:525)
>>     at org.apache.phoenix.spark.PhoenixRecordWritable$$anonfun$readFields$1.apply$mcVI$sp(PhoenixRecordWritable.scala:96)
>>     at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141)
>>     at org.apache.phoenix.spark.PhoenixRecordWritable.readFields(PhoenixRecordWritable.scala:93)
>>     at org.apache.phoenix.mapreduce.PhoenixRecordReader.nextKeyValue(PhoenixRecordReader.java:168)
>>     at org.apache.spark.rdd.NewHadoopRDD$$anon$1.hasNext(NewHadoopRDD.scala:174)
>>     at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39)
>>     at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
>>     at org.apache.spark.util.Utils$.getIteratorSize(Utils.scala:1596)
>>     at org.apache.spark.rdd.RDD$$anonfun$count$1.apply(RDD.scala:1157)
>>     at org.apache.spark.rdd.RDD$$anonfun$count$1.apply(RDD.scala:1157)
>>     at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1870)
>>     at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1870)
>>     at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
>>     at org.apache.spark.scheduler.Task.run(Task.scala:89)
>>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:229)
>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>     at java.lang.Thread.run(Thread.java:748)
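P.S. In case it helps anyone else who hits this: both traces show the underflow only fires when the array column is actually materialized (PhoenixResultSet.getString/getObject), so a plain client-side sweep can list the suspect row keys before deleting them. A rough sketch only, with the JDBC URL and table/column names taken from the examples above as placeholders:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.sql.Statement;

    public class FindSuspectRows {
        public static void main(String[] args) throws SQLException {
            try (Connection conn = DriverManager.getConnection("jdbc:phoenix:journalnode,test");
                 Statement stmt = conn.createStatement();
                 ResultSet rs = stmt.executeQuery("SELECT A, D FROM SCHEMA.TABLE")) {
                while (rs.next()) {
                    long a = rs.getLong("A");      // key column; not an array, so safe to read
                    try {
                        rs.getString("D");         // VARCHAR ARRAY column; deserialization happens here
                    } catch (RuntimeException e) { // BufferUnderflowException is unchecked
                        System.out.println("Suspect row: A = " + a + " (" + e + ")");
                    }
                }
            }
        }
    }

Obviously a full scan like this is only practical on a small or staging table.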