[ 
https://issues.apache.org/jira/browse/PIG-3390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jarek Jarcec Cecho updated PIG-3390:
------------------------------------

    Attachment: PIG-3390.patch

I'm attaching a preliminary patch that gets basic support for HBase 0.95 
working. I'm not changing the status to {{Patch Available}} as the patch is not 
yet ready for commit. Nevertheless, I would appreciate any feedback.

I've tweaked the {{ivy}} configuration to include two HBase profiles, one for 
HBase 0.94- and a second for 0.95+. It seems the transitive dependencies of 0.95 
are not currently resolved properly, so I had to temporarily specify all of them 
manually (this seems to be tracked by HBASE-8488).

For the missing APIs:

* {{Scan.write(DataOutput)}}: It seems that we used this to manually serialize 
the {{Scan}} into the MapReduce job. I've used {{TableInputFormat}} to do that 
for us. This approach seems to work for both 0.94- and 0.95+.
* {{Mutation.setWriteToWAL(Boolean)}} was superseded by 
{{Mutation.setDurability(Durability)}}. Unfortunately, I did not find a clean 
way to overcome this API change. The current patch uses reflection to detect the 
HBase version and to call the proper API.
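
As a rough illustration of the reflection approach described in the second 
bullet (this is not the code from the attached patch), the sketch below probes 
for the newer setter and falls back to the older one. {{OldMutation}}, 
{{NewMutation}}, the local {{Durability}} enum and the {{WalCompat}} helper are 
hypothetical stand-ins so the example compiles without HBase on the classpath. 
The real classes live in {{org.apache.hadoop.hbase.client}}, and real code would 
probably also have to resolve {{Durability}} by name, since that type does not 
exist in 0.94.

```java
import java.lang.reflect.Method;

// Hypothetical stand-in for the 0.95+ Durability type (reduced to two values).
enum Durability { SKIP_WAL, SYNC_WAL }

// Hypothetical stand-in mimicking the 0.94- style API.
class OldMutation {
    boolean writeToWAL = true;
    public void setWriteToWAL(boolean write) { this.writeToWAL = write; }
}

// Hypothetical stand-in mimicking the 0.95+ style API.
class NewMutation {
    Durability durability = Durability.SYNC_WAL;
    public void setDurability(Durability d) { this.durability = d; }
}

final class WalCompat {
    private WalCompat() {}

    /**
     * Enables or disables WAL writes through whichever setter the object
     * exposes. Returns the name of the method that was invoked.
     */
    static String setWriteToWal(Object mutation, boolean write) {
        Class<?> cls = mutation.getClass();
        try {
            try {
                // Probe for the newer API first.
                Method m = cls.getMethod("setDurability", Durability.class);
                m.invoke(mutation, write ? Durability.SYNC_WAL : Durability.SKIP_WAL);
                return "setDurability";
            } catch (NoSuchMethodException newApiMissing) {
                // Newer setter is absent: fall back to the older API.
                Method m = cls.getMethod("setWriteToWAL", boolean.class);
                m.invoke(mutation, write);
                return "setWriteToWAL";
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(
                "No usable WAL setter on " + cls.getName(), e);
        }
    }
}
```

Probing for the method, rather than parsing a version string, is one possible 
way to do the detection. A production version would likely look the {{Method}} 
up once and cache it instead of repeating the lookup for every mutation.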

To test it out, you can use the following commands:

{code}
ant clean test -Dtestcase=TestHBaseStorage -Dhbaseversion=95
ant clean test -Dtestcase=TestHBaseStorage -Dhbaseversion=94 # default
{code}

Using Hadoop 2 won't currently work as the HBase artifacts for Hadoop 2 are not 
published, but in the future it should work the following way:

{code}
ant clean test -Dtestcase=TestHBaseStorage -Dhbaseversion=95 -Dhadoopversion=23 -Dhbasecompat=2
ant clean test -Dtestcase=TestHBaseStorage -Dhbaseversion=94 -Dhadoopversion=23 -Dhbasecompat=2
{code}

I'll be more than happy to hear any feedback on my approach!
                
> Make pig working with HBase 0.95
> --------------------------------
>
>                 Key: PIG-3390
>                 URL: https://issues.apache.org/jira/browse/PIG-3390
>             Project: Pig
>          Issue Type: New Feature
>            Reporter: Jarek Jarcec Cecho
>            Assignee: Jarek Jarcec Cecho
>         Attachments: PIG-3390.patch
>
>
> HBase 0.95 changed its API in an incompatible way. The following APIs that 
> {{HBaseStorage}} in Pig uses are no longer available:
> * {{Mutation.setWriteToWAL(Boolean)}}
> * {{Scan.write(DataOutput)}}
> In addition, HBase is no longer available as one monolithic archive with the 
> entire functionality, but has been broken down into smaller pieces such as 
> {{hbase-client}}, {{hbase-server}}, ...

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
