Hi,
I have a text file that holds the information for my HBase table. It is
comma-separated: the first field is the attribute name (which I want to use as
the column qualifier) and the second is the attribute value. The file looks
like this:
COMMON_NAME,corn
SCIENTIFIC_NAME,Zea mays
GENETIC_BACKGROUND,LH244
TISSUE,tassel
DEV_STAGE,V7-V8
TREATMENT,"Microspore mothercell stage (V7-V8), <0.5in"
ECTOPIC_TYPE,
I want to load this file and store it in the HBase table. The table schema
is discovery_rnaseq_library (A: attribute_name, value: attribute_value).
Here is my Pig script:
library_tag = LOAD '/my_path/345_lib_description.txt'
    USING PigStorage(',')
    AS (tag:chararray, value:chararray);
library_id = LOAD 'discovery_rnaseq_library'
    USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('A:organism', '-loadKey true')
    AS (id:int, name:chararray);
grpd = GROUP library_id ALL;
data_id = FOREACH grpd GENERATE ((MAX(library_id.id)) + 1) AS id;
finalData = CROSS data_id, library_tag;
STORE library_tag INTO 'hbase://library'
    USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('A:tag A:value');
I also scan the table to get the max id and use it to insert the new record.
The problem is that every record is inserted with the literal "tag" as the
column qualifier. Here is what I get after running this Pig script:
COLUMN             CELL
A:tag              timestamp=1348451755196, value=REPLICATE_NUMBER
A:value            timestamp=1348451755196, value=1
Whereas I want it to be something like this:
COLUMN             CELL
A:COMMON_NAME      timestamp=1348451755196, value=corn
A:SCIENTIFIC_NAME  timestamp=1348451755196, value=Zea mays
...
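To make the target layout concrete, here is a minimal Python sketch (not Pig, and not something I actually run) of the mapping I am after: each attribute name becomes a qualifier under column family A, and the attribute value becomes the cell value. The function name to_hbase_columns and the SAMPLE constant are made up for illustration. It uses Python's csv module because a plain split on ',' would cut the quoted TREATMENT value in two, and I suspect my PigStorage(',') load has the same issue.

```python
import csv
import io

# Sample rows copied from the description file shown earlier in this message.
SAMPLE = '''COMMON_NAME,corn
SCIENTIFIC_NAME,Zea mays
GENETIC_BACKGROUND,LH244
TISSUE,tassel
DEV_STAGE,V7-V8
TREATMENT,"Microspore mothercell stage (V7-V8), <0.5in"
ECTOPIC_TYPE,
'''


def to_hbase_columns(text, family="A"):
    """Map each 'name,value' line to a '<family>:<name>' -> value entry.

    Hypothetical helper for illustration only; it does not talk to HBase.
    """
    columns = {}
    # csv.reader honours double quotes, so a comma inside a quoted value
    # does not start a new field.
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # skip blank lines
        name = row[0].strip()
        value = row[1].strip() if len(row) > 1 else ""
        columns[f"{family}:{name}"] = value
    return columns


if __name__ == "__main__":
    for column, value in to_hbase_columns(SAMPLE).items():
        print(column, "=>", repr(value))
```

In other words, all seven attributes should end up as seven columns of a single row keyed by the new id, rather than as fixed A:tag and A:value columns.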
I would highly appreciate any comments.
Thanks!
-Zara