Dmitriy, I followed up with the HBase user group, which led me to track
down the reason my region observer coprocessor was not firing. The
timestamp 'ts' set in the putNext(Tuple t) method of HBaseStorage.java
did not match the timestamp being compared in the
org.apache.hadoop.hbase.client.Put class, in the method has(byte[]
family, byte[] qualifier, long ts, byte[] value, boolean ignoreTS,
boolean ignoreValue), at:
if (Arrays.equals(kv.getFamily(), family)
    && Arrays.equals(kv.getQualifier(), qualifier)
    && kv.getTimestamp() == ts) {
The fix is simple: stop passing the 'ts' long and invoke the other
Put.add() overload, which does not take the versioning timestamp.
Jira 2615 created with patch.
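For anyone following along, the mismatch and the fix can be sketched as
follows. These are illustrative stand-in classes, not the real HBase 0.92
API; the point is only that Put.has() requires an exact timestamp match, so
a cell pinned to an explicit client-side 'ts' is missed when the check runs
against the wildcard default.

```java
// Stand-in sketch (not the real HBase classes) of the timestamp mismatch.
public class TimestampMismatchSketch {
    // HBase uses Long.MAX_VALUE (HConstants.LATEST_TIMESTAMP) as its
    // wildcard default timestamp.
    static final long LATEST_TIMESTAMP = Long.MAX_VALUE;

    // Mimics the equality check quoted above from Put.has():
    // kv.getTimestamp() == ts
    static boolean has(long cellTimestamp, long queryTimestamp) {
        return cellTimestamp == queryTimestamp;
    }

    public static void main(String[] args) {
        long explicitTs = 1332632471000L; // hypothetical ts pinned by putNext(Tuple t)

        // Broken path: cell pinned to explicitTs, check done with the default
        System.out.println(has(explicitTs, LATEST_TIMESTAMP)); // prints false

        // Fixed path: no explicit 'ts' passed, cell keeps the wildcard default
        System.out.println(has(LATEST_TIMESTAMP, LATEST_TIMESTAMP)); // prints true
    }
}
```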
On Sat, Mar 24, 2012 at 11:41 AM, Nick Pinckernell <[email protected]> wrote:
> Sure, I will bring it up with the HBase folks and then open a Jira if
> necessary.
> Thanks!
>
>
> On Fri, Mar 23, 2012 at 6:59 PM, Dmitriy Ryaboy <[email protected]> wrote:
>
>> HBaseStorage was implemented for 0.90, which didn't have coprocessors,
>> so I suppose this isn't terribly surprising. Do you mind opening a Jira,
>> and perhaps pointing it out on the HBase dev list? We aren't using any
>> special HBase APIs in Pig, so it should just work... maybe they
>> deprecated something?
>>
>> D
>>
>> On Thu, Mar 22, 2012 at 10:54 PM, Nick <[email protected]> wrote:
>>
>> > I'm having a possible issue with a simple Pig load that writes to an
>> > HBase table. When I run the test Pig script, it does not invoke the
>> > region observer coprocessor on the table. I have verified that my
>> > coprocessor executes when I use the HBase client API to do a simple
>> > put().
>> >
>> > The Pig script is as follows (test.pig):
>> > register /dev/hbase-0.92.0/hbase-0.92.0.jar;
>> > register /dev/hbase-0.92.0/lib/zookeeper-3.4.2.jar;
>> > register /dev/hbase-0.92.0/lib/guava-r09.jar;
>> > A = load '/tmp/testdata.csv' using PigStorage(',');
>> > store A into 'hbase://test' using
>> > org.apache.pig.backend.hadoop.hbase.HBaseStorage('f:t');
>> >
>> > Using the following environment variables and command:
>> > export HADOOP_HOME=/dev/hadoop-1.0.0
>> > export PIG_CLASSPATH=/dev/hadoop-1.0.0/conf
>> > export HBASE_HOME=/dev/hbase-0.92.0/
>> > export PIG_CLASSPATH="`${HBASE_HOME}/bin/hbase classpath`:$PIG_CLASSPATH"
>> > /dev/pig-0.9.2/bin/pig -x local -f test.pig
>> >
>> > I have also tried 'pig -x mapreduce' and it still does not seem to
>> > invoke the coprocessor. After looking through the HBaseStorage class,
>> > it appears that the RecordWriter is receiving HBase Put objects and
>> > that those are ultimately getting flushed, so I'm not sure why the
>> > coprocessor is not executing.
>> >
>> > Is this by design, or am I missing something about how the output
>> > from the Pig job is being loaded into the HBase table?
>> > Thank you
>> >
>>
>
>