HBaseStorage writes to the column descriptors specified in its constructor,
and in your case you're telling it to use 'A:tag A:value', so every record
is written to those two columns. If you want to write to other columns, you
need to statically define them there.

If you want to use dynamic column names, you could subclass HBaseStorage
and re-implement the part where the HBase Put happens, so that it uses your
tag value as the column descriptor instead of a static column list.
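
Here is a minimal sketch of that per-row logic, in plain Java with no Pig or
HBase dependencies. The class and method names (DynamicColumnSketch, toCells)
are hypothetical, and a Map stands in for the HBase Put. In a real subclass
this loop would live in whichever HBaseStorage method builds the Put, which
you would need to check in the source for your Pig version.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: a Map of "family:qualifier" -> value stands in for an HBase Put.
final class DynamicColumnSketch {

    // Builds one cell per (tag, value) pair, taking the column qualifier from
    // the tag field instead of from a static column list.
    static Map<String, String> toCells(String family, List<String[]> tagValuePairs) {
        Map<String, String> cells = new LinkedHashMap<>();
        for (String[] pair : tagValuePairs) {
            String tag = pair[0].trim();
            // A row such as "ECTOPIC_TYPE," has no value; store an empty string.
            String value = pair.length > 1 ? pair[1] : "";
            if (tag.isEmpty()) {
                continue; // no tag means no qualifier to write to
            }
            cells.put(family + ":" + tag, value);
        }
        return cells;
    }

    public static void main(String[] args) {
        List<String[]> rows = Arrays.asList(
            new String[] {"COMMON_NAME", "corn"},
            new String[] {"SCIENTIFIC_NAME", "Zea mays"},
            new String[] {"ECTOPIC_TYPE"});
        // Print each cell the way the desired table layout names its columns.
        toCells("A", rows).forEach(
            (column, value) -> System.out.println(column + " = " + value));
    }
}
```

Each (tag, value) row becomes a cell under the fixed family 'A' with the tag
as its qualifier, which is the layout you described wanting.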

On Mon, Sep 24, 2012 at 7:30 AM, HAJIHASHEMI, ZAHRA (AG/1000) <
[email protected]> wrote:

> Hi,
>
>
>
> I have a text file that holds my HBase table information. It is comma
> separated: the first field is the attribute name (which I want to use as
> the column qualifier) and the second is the attribute value. The file
> looks like this:
>
> COMMON_NAME,corn
> SCIENTIFIC_NAME,Zea mays
> GENETIC_BACKGROUND,LH244
> TISSUE,tassel
> DEV_STAGE,V7-V8
> TREATMENT,"Microspore mothercell stage (V7-V8), <0.5in"
> ECTOPIC_TYPE,
>
>
>
> So I want to load this file and store it into the hbase table. The table
> schema is discovery_rnaseq_library (A: attribute_name, value:
> attribute_value)
>
> Here is my pig script:
>
>
> library_tag = LOAD '/my_path/345_lib_description.txt' USING
> PigStorage(',') AS (tag:chararray, value:chararray);
> library_id = LOAD 'discovery_rnaseq_library' USING
> org.apache.pig.backend.hadoop.hbase.HBaseStorage('A:organism' ,'-loadKey
> true') as (id:int, name:chararray);
> grpd = group library_id all;
> data_id = foreach grpd generate ((MAX(library_id.id))+1) as id;
> finalData = CROSS data_id, library_tag;
> STORE library_tag INTO 'hbase://library' USING
> org.apache.pig.backend.hadoop.hbase.HBaseStorage('A:tag A:value');
>
>
>
> I also scan the table to get the max id and use it to insert a new
> record.
>
>
>
> The problem is that it just inserts all of the records with "tag" as the
> column qualifier. Here is what I get after running this pig script:
>
> COLUMN                               CELL
>  A:tag                               timestamp=1348451755196, value=REPLICATE_NUMBER
>  A:value                             timestamp=1348451755196, value=1
>
> Whereas I want it to be something like this:
> COLUMN                               CELL
> A: COMMON_NAME      timestamp=1348451755196, value=corn
>
> A:SCIENTIFIC_NAME   timestamp=1348451755196, value= Zea mays
>
> ...
>
>
> I highly appreciate any comments.
>
> Thanks!
>
> -Zara
>


