Re: Hive UDF for creating row key in HBASE
Hi Chethan,

As Ethan mentioned, take a look first at the Phoenix/Hive integration. If that doesn't work for you, the best way to get the row key for a Phoenix table is to execute an UPSERT VALUES statement against the primary key columns without committing it. We have a utility function that returns the Cells that would be submitted to the server, which you can use to extract the row key. You can do this through a "connectionless" JDBC connection, so you don't need any RPCs (including for the CREATE TABLE call that gives Phoenix the table metadata). Take a look at ConnectionlessTest.testConnectionlessUpsert() for an example.

Thanks,
James

On Sun, Dec 17, 2017 at 1:19 PM, Ethan wrote:
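The pattern James describes might look roughly like the sketch below. This is an assumption-laden illustration, not a tested program: it requires the Phoenix client jar on the classpath, the table and column names are invented, and the exact generic type returned by PhoenixRuntime.getUncommittedDataIterator varies by Phoenix version (List<KeyValue> in the 4.x line).

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.util.Iterator;
import java.util.List;

import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.util.Pair;
import org.apache.phoenix.util.PhoenixRuntime;

public class ConnectionlessRowKey {
    public static void main(String[] args) throws Exception {
        // "none" in place of the ZooKeeper quorum yields a "connectionless"
        // connection: no RPCs, so the CREATE TABLE below only registers
        // metadata on the client side.
        try (Connection conn = DriverManager.getConnection("jdbc:phoenix:none")) {
            conn.createStatement().execute(
                "CREATE TABLE t (pk1 VARCHAR NOT NULL, pk2 VARCHAR NOT NULL, "
                + "CONSTRAINT pk PRIMARY KEY (pk1, pk2))");
            conn.createStatement().execute(
                "UPSERT INTO t (pk1, pk2) VALUES ('a', 'b')");
            // Deliberately no conn.commit(): instead, inspect the Cells that
            // would have been sent to the server.
            Iterator<Pair<byte[], List<KeyValue>>> uncommitted =
                PhoenixRuntime.getUncommittedDataIterator(conn);
            while (uncommitted.hasNext()) {
                for (KeyValue kv : uncommitted.next().getSecond()) {
                    // The HBase row key Phoenix built: 'a', 0x00, 'b' here.
                    byte[] rowKey = CellUtil.cloneRow(kv);
                }
            }
        }
    }
}
```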
Re: Hive UDF for creating row key in HBASE
Hi Chethan,

When you write data from HDFS, are you planning to use Hive to do the ETL? Could we do something like reading from HDFS and using Phoenix to write into HBase?

There is https://phoenix.apache.org/hive_storage_handler.html, but I think that enables Hive to read from a Phoenix table, not the other way around.

Thanks,

On December 16, 2017 at 8:09:10 PM, Chethan Bhawarlal (cbhawar...@collectivei.com) wrote:
Hive UDF for creating row key in HBASE
Hi Dev,

Currently I am planning to write data from HDFS to HBase, and to read the data I am using Phoenix.

Phoenix separates the values of its primary key columns with a zero byte ("\x00") and stores the result in HBase as the row key.

I want to write a custom Hive UDF that builds the HBase row key so that Phoenix can split it back into multiple columns.

Following is the custom UDF code I am trying to write:

    import org.apache.hadoop.hive.ql.exec.Description;
    import org.apache.hadoop.hive.ql.exec.UDF;

    @Description(name = "hbasekeygenerator",
        value = "_FUNC_(existing) - Returns a unique rowkey value for hbase")
    public class CIHbaseKeyGenerator extends UDF {
        public String evaluate(String[] args) {
            // Use the actual NUL character as the separator.
            // (Byte.toString((byte) 0x00) would append the string "0", not "\x00".)
            char separator = '\u0000';
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < args.length - 1; ++i) {
                sb.append(args[i]);
                sb.append(separator);
            }
            sb.append(args[args.length - 1]);
            return sb.toString();
        }
    }

Following are my questions:

1. Is it possible to emulate Phoenix's encoding behavior using a custom Hive UDF?

2. If it is possible, what is the better approach for this? It would be great if someone could share some pointers on this.

Thanks,
Chethan.

-- 
Collective[i] dramatically improves sales and marketing performance using technology, applications and a revolutionary network designed to provide next generation analytics and decision-support directly to business users. Our goal is to maximize human potential and minimize mistakes. In most cases, the results are astounding. We cannot, however, stop emails from sometimes being sent to the wrong person. If you are not the intended recipient, please notify us by replying to this email's sender and deleting it (and any attachments) permanently from your system. If you are, please respect the confidentiality of this communication's contents.
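On question 1: the byte layout itself is straightforward to reproduce for variable-length VARCHAR key columns, since for those Phoenix simply joins the values with a \x00 separator (fixed-width and numeric key columns are encoded differently, so this sketch only covers the VARCHAR case). A minimal round-trip illustration, independent of Hive and Phoenix, with invented sample values:

```java
import java.util.Arrays;
import java.util.List;

public class NullSeparatedKey {
    // Join the key parts with the NUL separator, as Phoenix does for
    // variable-length VARCHAR primary key columns.
    static String encode(List<String> parts) {
        return String.join("\u0000", parts);
    }

    // Split the row key back into its parts. The -1 limit keeps trailing
    // empty strings, so empty key parts survive the round trip.
    static List<String> decode(String rowKey) {
        return Arrays.asList(rowKey.split("\u0000", -1));
    }

    public static void main(String[] args) {
        List<String> parts = Arrays.asList("org1", "user42", "2017-12-16");
        String key = encode(parts);
        System.out.println(key.length());              // 22: 20 chars + 2 separators
        System.out.println(decode(key).equals(parts)); // true
    }
}
```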