[ https://issues.apache.org/jira/browse/HBASE-26021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17368776#comment-17368776 ]

Viraj Jasani edited comment on HBASE-26021 at 6/24/21, 6:47 PM:
----------------------------------------------------------------

Here are the relevant HBase proto messages:
{code:java}
/**
 * Table Schema
 * Inspired by the rest TableSchema
 */
message TableSchema {
  optional TableName table_name = 1;
  repeated BytesBytesPair attributes = 2;
  repeated ColumnFamilySchema column_families = 3;
  repeated NameStringPair configuration = 4;
}

/** Denotes state of the table */
message TableState {
  // Table's current state
  enum State {
    ENABLED = 0;
    DISABLED = 1;
    DISABLING = 2;
    ENABLING = 3;
  }
  // This is the table's state.
  required State state = 1;
  required TableName table = 2;
  optional uint64 timestamp = 3;
}

/** On HDFS representation of table state. */
message TableDescriptor {
  required TableSchema schema = 1;
  optional TableState.State state = 2 [ default = ENABLED ];
}

{code}
Serialization in FSTableDescriptors:
{code:java}
private static void writeTD(final FileSystem fs, final Path p, final TableDescriptor htd)
    throws IOException {
  FSDataOutputStream out = fs.create(p, false);
  try {
    // We used to write this file out as a serialized HTD Writable followed by two '\n's
    // and then the toString version of HTD.  Now we just write out the pb serialization.
    out.write(htd.toByteArray());
  } finally {
    out.close();
  }
}
{code}
Before this patch, we used to serialize "HTableDescriptor" instead of "TableDescriptor" in the above method (as [~apurtell] mentioned).

 

TableDescriptor/HTableDescriptor's toByteArray():
{code:java}
/**
 * @return This instance serialized with pb with pb magic prefix
 * @see #parseFrom(byte[])
 */
public byte [] toByteArray() {
  return ProtobufUtil.prependPBMagic(convert().toByteArray());
}
{code}
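(Side note, my addition: the "pb magic" prefix here is HBase's 4-byte "PBUF" marker. The parseFrom() shown below checks for it and then skips it before handing the remaining bytes to the protobuf parser.)
{code:java}
// Illustration only: how the magic prefix relates to the offsets used in parseFrom below.
byte[] serialized = htd.toByteArray();                        // htd as in writeTD above
boolean hasMagic = ProtobufUtil.isPBMagicPrefix(serialized);  // bytes start with 'P','B','U','F'
int pblen = ProtobufUtil.lengthOfPBMagic();                   // 4; proto parsing starts at this offset
{code}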
Here is the main difference between the two convert() implementations:

HTableDescriptor#convert converts the descriptor to HBaseProtos.TableSchema (shown in the first code block above), whereas TableDescriptor#convert converts it to HBaseProtos.TableDescriptor (also shown above). Per the proto definitions, HBaseProtos.TableDescriptor contains two fields, HBaseProtos.TableSchema and HBaseProtos.TableState.State, so state is the additional field.

HBase 2 deserializes only HBaseProtos.TableSchema as per this code:
{code:java}
/**
 * @param bytes A pb serialized {@link ModifyableTableDescriptor} instance
 * with pb magic prefix
 * @return An instance of {@link ModifyableTableDescriptor} made from
 * <code>bytes</code>
 * @throws DeserializationException
 * @see #toByteArray()
 */
private static TableDescriptor parseFrom(final byte[] bytes)
        throws DeserializationException {
  if (!ProtobufUtil.isPBMagicPrefix(bytes)) {
    throw new DeserializationException("Expected PB encoded ModifyableTableDescriptor");
  }
  int pblen = ProtobufUtil.lengthOfPBMagic();
  HBaseProtos.TableSchema.Builder builder = HBaseProtos.TableSchema.newBuilder();
  try {
    ProtobufUtil.mergeFrom(builder, bytes, pblen, bytes.length - pblen);
    return ProtobufUtil.toTableDescriptor(builder.build());
  } catch (IOException e) {
    throw new DeserializationException(e);
  }
}
{code}
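To make the mismatch concrete, here is a minimal sketch (my own illustration, assuming the branch-1 generated HBaseProtos classes, com.google.protobuf.ByteString and Bytes.toStringBinary are available; hbase:meta is just an example) of the two payloads convert() can produce, pb magic prefix omitted:
{code:java}
// Build one example descriptor and serialize it both ways discussed above.
HBaseProtos.TableName tableName = HBaseProtos.TableName.newBuilder()
    .setNamespace(ByteString.copyFromUtf8("hbase"))
    .setQualifier(ByteString.copyFromUtf8("meta"))
    .build();
HBaseProtos.TableSchema schema = HBaseProtos.TableSchema.newBuilder()
    .setTableName(tableName)
    .build();

// What HTableDescriptor#convert produces, and what the parseFrom above expects:
byte[] bareSchema = schema.toByteArray();
// -> \x0A\x0D\x0A\x05hbase\x12\x04meta         (field 1 = TableName)

// What 1.7's TableDescriptor#convert produces: the same schema nested one level
// deeper, under field 1 of HBaseProtos.TableDescriptor.
byte[] wrapped = HBaseProtos.TableDescriptor.newBuilder()
    .setSchema(schema)
    .build()
    .toByteArray();
// -> \x0A\x0F\x0A\x0D\x0A\x05hbase\x12\x04meta (field 1 = the whole TableSchema)

System.out.println(Bytes.toStringBinary(bareSchema));
System.out.println(Bytes.toStringBinary(wrapped));
{code}
So when the parseFrom above merges the 1.7 bytes into a TableSchema builder, the nested TableSchema is decoded as the table_name field, and the raw TableName sub-message bytes ("\x0A\x05hbase\x12\x04meta", i.e. a leading '\n') end up as the namespace value, which lines up with the "\n hbase meta" / "\n hbase namespace" strings and the "Illegal character" errors in the description.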
Oddly, HBase 2's HBaseProtos does not even have a TableDescriptor message. This means that HBASE-7767 introduced HBaseProtos.TableDescriptor, it was later reverted, and it is now no longer used.

Let me dig into when and why this was removed. [~apurtell] [~bharathv] Although we have released 1.7, we might not be able to continue with this proto incompatibility, specifically when HBase 1 has a new TableDescriptor class with its own proto definition while HBase 2 has that definition removed. (Edit: more details in following comments)

-This raises the question of whether we should really have Protobuf compatibility guidelines, just like the source compatibility guidelines we follow for backward-compatible releases.-
{quote}Unless it actually prevents an upgrade from 1.6, which moots the release 
and we will have to withdraw it.
{quote}
Let me test this as well today if possible (else tomorrow), but based on the code, it seems we should not have any problem here. This [commit|https://github.com/apache/hbase/commit/431b8a5383b894381583bbb9ceef5911911b705c] takes care of backward compatibility.

I assume deserialization of 1.6-written files should fail on 1.7's new parse path, and 1.7 has backward-compat code in place once DeserializationException is caught:
{code:java}
private static TableDescriptor readTableDescriptor(FileSystem fs, FileStatus status,
    boolean rewritePb) throws IOException {
  int len = Ints.checkedCast(status.getLen());
  byte [] content = new byte[len];
  FSDataInputStream fsDataInputStream = fs.open(status.getPath());
  try {
    fsDataInputStream.readFully(content);
  } finally {
    fsDataInputStream.close();
  }
  TableDescriptor td = null;
  try {
    td = TableDescriptor.parseFrom(content);
  } catch (DeserializationException e) {
    // we have old HTableDescriptor here
    try {
      HTableDescriptor ohtd = HTableDescriptor.parseFrom(content);
      LOG.warn("Found old table descriptor, converting to new format for table 
" +
        ohtd.getTableName());
      td = new TableDescriptor(ohtd);
      if (rewritePb) {
        rewriteTableDescriptor(fs, status, td);
      }
    } catch (DeserializationException e1) {
      throw new IOException("content=" + Bytes.toShort(content), e1);
    }
  }
  if (rewritePb && !ProtobufUtil.isPBMagicPrefix(content)) {
    // Convert the file over to be pb before leaving here.
    rewriteTableDescriptor(fs, status, td);
  }
  return td;
}{code}
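For reference, here is a rough sketch (my own illustration, not code from the patch; "testTable" is just an example) of the kind of check I have in mind: feed bytes in the pre-1.7 layout, as HTableDescriptor#toByteArray writes them, through the same parse-then-fallback logic as readTableDescriptor above. The expectation, per that code, is that TableDescriptor.parseFrom rejects them and the HTableDescriptor branch takes over.
{code:java}
// Illustration only (assume the enclosing method declares IOException and
// DeserializationException): simulate a descriptor serialized in the old layout,
// i.e. pb magic + bare TableSchema, and parse it the way 1.7 does.
byte[] oldLayout = new HTableDescriptor(TableName.valueOf("testTable")).toByteArray();
TableDescriptor td;
try {
  td = TableDescriptor.parseFrom(oldLayout);   // expects the 1.7 TableDescriptor wrapper
} catch (DeserializationException e) {
  // expected path for files written before 1.7: fall back to the old parser,
  // mirroring readTableDescriptor above
  td = new TableDescriptor(HTableDescriptor.parseFrom(oldLayout));
}
{code}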



> HBase 1.7 to 2.4 upgrade issue due to incompatible deserialization
> ------------------------------------------------------------------
>
>                 Key: HBASE-26021
>                 URL: https://issues.apache.org/jira/browse/HBASE-26021
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 1.7.0, 2.4.4
>            Reporter: Viraj Jasani
>            Priority: Major
>         Attachments: Screenshot 2021-06-22 at 12.54.21 PM.png, Screenshot 
> 2021-06-22 at 12.54.30 PM.png
>
>
> As of today, if we bring up an HBase cluster using branch-1 and upgrade to 
> branch-2.4, we face an issue while parsing the namespace from the HDFS fileinfo. 
> Instead of "*hbase:meta*" and "*hbase:namespace*", parsing using ProtobufUtil 
> seems to produce "*\n hbase meta*" and "*\n hbase namespace*".
> {code:java}
> 2021-06-22 00:05:56,611 INFO  
> [RpcServer.priority.RWQ.Fifo.read.handler=3,queue=1,port=16025] 
> regionserver.RSRpcServices: Open hbase:meta,,1.1588230740
> 2021-06-22 00:05:56,648 INFO  
> [RpcServer.priority.RWQ.Fifo.read.handler=5,queue=1,port=16025] 
> regionserver.RSRpcServices: Open 
> hbase:namespace,,1624297762817.396cb6cc00cd4334cb1ea3a792d7529a.
> 2021-06-22 00:05:56,759 ERROR 
> [RpcServer.priority.RWQ.Fifo.read.handler=5,queue=1,port=16025] 
> ipc.RpcServer: Unexpected throwable object
> java.lang.IllegalArgumentException: Illegal character <
> > at 0. Namespaces may only contain 'alphanumeric characters' from any 
> > language and digits:
> ^Ehbase^R       namespace
>         at 
> org.apache.hadoop.hbase.TableName.isLegalNamespaceName(TableName.java:246)
>         at 
> org.apache.hadoop.hbase.TableName.isLegalNamespaceName(TableName.java:220)
>         at org.apache.hadoop.hbase.TableName.<init>(TableName.java:348)
>         at 
> org.apache.hadoop.hbase.TableName.createTableNameIfNecessary(TableName.java:385)
>         at org.apache.hadoop.hbase.TableName.valueOf(TableName.java:508)
>         at 
> org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil.toTableName(ProtobufUtil.java:2292)
>         at 
> org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil.toTableDescriptor(ProtobufUtil.java:2937)
>         at 
> org.apache.hadoop.hbase.client.TableDescriptorBuilder$ModifyableTableDescriptor.parseFrom(TableDescriptorBuilder.java:1625)
>         at 
> org.apache.hadoop.hbase.client.TableDescriptorBuilder$ModifyableTableDescriptor.access$200(TableDescriptorBuilder.java:597)
>         at 
> org.apache.hadoop.hbase.client.TableDescriptorBuilder.parseFrom(TableDescriptorBuilder.java:320)
>         at 
> org.apache.hadoop.hbase.util.FSTableDescriptors.readTableDescriptor(FSTableDescriptors.java:511)
>         at 
> org.apache.hadoop.hbase.util.FSTableDescriptors.getTableDescriptorFromFs(FSTableDescriptors.java:496)
>         at 
> org.apache.hadoop.hbase.util.FSTableDescriptors.getTableDescriptorFromFs(FSTableDescriptors.java:482)
>         at 
> org.apache.hadoop.hbase.util.FSTableDescriptors.get(FSTableDescriptors.java:210)
>         at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.openRegion(RSRpcServices.java:2112)
>         at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.AdminProtos$AdminService$2.callBlockingMethod(AdminProtos.java:35218)
>         at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:395)
>         at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:133)
>         at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:338)
>         at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:318)
> 2021-06-22 00:05:56,759 ERROR 
> [RpcServer.priority.RWQ.Fifo.read.handler=3,queue=1,port=16025] 
> ipc.RpcServer: Unexpected throwable object
> java.lang.IllegalArgumentException: Illegal character <
> > at 0. Namespaces may only contain 'alphanumeric characters' from any 
> > language and digits:
> ^Ehbase^R^Dmeta
>         at 
> org.apache.hadoop.hbase.TableName.isLegalNamespaceName(TableName.java:246)
>         at 
> org.apache.hadoop.hbase.TableName.isLegalNamespaceName(TableName.java:220)
>         at org.apache.hadoop.hbase.TableName.<init>(TableName.java:348)
>         at 
> org.apache.hadoop.hbase.TableName.createTableNameIfNecessary(TableName.java:385)
>         at org.apache.hadoop.hbase.TableName.valueOf(TableName.java:508)
>         at 
> org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil.toTableName(ProtobufUtil.java:2292)
>         at 
> org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil.toTableDescriptor(ProtobufUtil.java:2937)
>         at 
> org.apache.hadoop.hbase.client.TableDescriptorBuilder$ModifyableTableDescriptor.parseFrom(TableDescriptorBuilder.java:1625)
>         at 
> org.apache.hadoop.hbase.client.TableDescriptorBuilder$ModifyableTableDescriptor.access$200(TableDescriptorBuilder.java:597)
>         at 
> org.apache.hadoop.hbase.client.TableDescriptorBuilder.parseFrom(TableDescriptorBuilder.java:320)
>         at 
> org.apache.hadoop.hbase.util.FSTableDescriptors.readTableDescriptor(FSTableDescriptors.java:511)
>         at 
> org.apache.hadoop.hbase.util.FSTableDescriptors.getTableDescriptorFromFs(FSTableDescriptors.java:496)
>         at 
> org.apache.hadoop.hbase.util.FSTableDescriptors.getTableDescriptorFromFs(FSTableDescriptors.java:482)
>         at 
> org.apache.hadoop.hbase.util.FSTableDescriptors.get(FSTableDescriptors.java:210)
>         at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.openRegion(RSRpcServices.java:2112)
>         at 
> org.apache.hadoop.hbase.shaded.protobuf.generated.AdminProtos$AdminService$2.callBlockingMethod(AdminProtos.java:35218)
>         at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:395)
>         at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:133)
>         at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:338)
>         at 
> org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:318){code}


