Hello,

I am having some issues reading int64 values from an HBase table through 
Thrift, using the thrift npm module for Node.js.

In HBase, a TCell is defined as containing two fields: an int64 timestamp and a 
byte array called value.

However, when a TCell is read by the thrift npm module (in the 
thrift/lib/thrift/transport.js file), the value is interpreted as a UTF-8 
string rather than a byte array, and some bytes are lost in the process:

readString: function(len) {
  this.ensureAvailable(len)
  var str = this.inBuf.toString('utf8', this.readCursor, this.readCursor + len);
  this.readCursor += len;
  return str;
},

For example, when I look at my row in the HBase shell, I see

value=\x00\x00\x00\x00\x00\x01\xA6\x94

When I fetch it from nodejs, I get 00 00 00 00 00 01 ef bf bd ef bf bd

Basically, any time a single byte (two hex digits) is higher than \x7F, what I 
get in JavaScript in its place is 'ef bf bd', which is the UTF-8 encoding of 
U+FFFD, the Unicode replacement character.
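
The substitution described above can be reproduced with a plain Node.js Buffer, 
without thrift or HBase involved, which suggests it comes from the UTF-8 
decoding step. This is only a minimal sketch; the variable names are made up 
for illustration and the bytes are the ones shown in the HBase shell output.

```javascript
// The eight bytes of the cell value, as shown in the HBase shell.
const raw = Buffer.from([0x00, 0x00, 0x00, 0x00, 0x00, 0x01, 0xa6, 0x94]);

// Decoding as UTF-8: 0xa6 and 0x94 are not valid on their own,
// so each one is replaced by U+FFFD.
const asUtf8 = raw.toString('utf8');

// Re-encoding the string gives ef bf bd for every U+FFFD,
// so the original two bytes cannot be recovered.
const roundTripped = Buffer.from(asUtf8, 'utf8');

console.log(roundTripped.toString('hex'));
// prints 000000000001efbfbdefbfbd
```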

Also, in the code snippet from transport.js, if I decode the buffer as 
'binary' instead of 'utf8', then the value is passed correctly to my code:
var str = this.inBuf.toString('binary', this.readCursor, this.readCursor + len);

I guess that making this change, however, would imply that the clients of the 
npm module would need to convert the returned strings themselves, on a 
per-column basis.
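
To illustrate what that per-column conversion might look like on the client 
side, here is a small sketch. It assumes the module hands back a string decoded 
as 'binary' (an alias for 'latin1' in Node.js); toBytes is a hypothetical 
helper, not something the thrift npm provides.

```javascript
// The eight bytes of the cell value, as shown in the HBase shell.
const raw = Buffer.from([0x00, 0x00, 0x00, 0x00, 0x00, 0x01, 0xa6, 0x94]);

// 'binary' maps every byte to exactly one character, so nothing is lost.
const asBinary = raw.toString('binary');

// Hypothetical client-side helper: turn such a string back into bytes.
function toBytes(binaryString) {
  return Buffer.from(binaryString, 'binary');
}

console.log(toBytes(asBinary).toString('hex'));
// prints 000000000001a694
```

A column that really holds text would instead need 
toBytes(value).toString('utf8') on the client, which is the per-column cost 
mentioned above.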

So I guess that my question is the following:

Even though the name of the function in transport.js is readString, it seems to 
be used to read byte arrays in some cases (at least in the context of reading a 
TCell from HBase). Am I right, or did I miss something?

Also, is there any other way to read an Int64 from HBase using the thrift npm 
module?
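
For reference, this is roughly what I would like to end up doing once the bytes 
arrive intact. It is only a sketch under several assumptions: the value is 
stored as an 8-byte big-endian signed integer (which is what the shell output 
looks like), the bytes reach my code as a 'binary' string, and the Node.js 
version is recent enough to have Buffer.readBigInt64BE. readInt64BE is a 
hypothetical helper name.

```javascript
// Hypothetical helper: decode an 8-byte big-endian signed integer.
function readInt64BE(bytes) {
  if (bytes.length !== 8) {
    throw new RangeError('expected exactly 8 bytes, got ' + bytes.length);
  }
  // Returns a BigInt, so values beyond 2^53 keep full precision.
  return bytes.readBigInt64BE(0);
}

// The cell value from the HBase shell, received as a 'binary' string.
const cell = Buffer.from('\x00\x00\x00\x00\x00\x01\xA6\x94', 'binary');
const value = readInt64BE(cell);

console.log(value.toString());
// prints 108180
```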

Thanks a lot
Jeremie