In the Pig 0.9.2 source, HBaseStorage contains two put.add(family, qualifier, ts, value) calls, where Pig creates the timestamp itself via 'long ts = System.currentTimeMillis();'. Simply invoking the other overload, put.add(family, qualifier, value), without that explicit 'ts' fixed the issue: I now see my coprocessor code being called. Looks like those puts were just using an incorrect row version (cell timestamp).

I'll let the Pig folks know and create a JIRA for them. Thanks for the help!
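For context, a minimal sketch of the difference between the two Put.add() overloads against the HBase 0.92 client API; the demo class and the row/family/qualifier literals are illustrative, not taken from the Pig source:

    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.util.Bytes;

    // Illustrative only -- shows the two overloads discussed above.
    public class PutTimestampDemo {
      public static void main(String[] args) {
        Put put = new Put(Bytes.toBytes("row1"));  // the Put's own ts defaults
                                                   // to HConstants.LATEST_TIMESTAMP
        byte[] family = Bytes.toBytes("f");
        byte[] qualifier = Bytes.toBytes("t");
        byte[] value = Bytes.toBytes("v");

        // What Pig 0.9.2's HBaseStorage did: stamp the cell with a client-side
        // timestamp, so the KeyValue's ts no longer matches the Put's own ts.
        long ts = System.currentTimeMillis();
        put.add(family, qualifier, ts, value);

        // The fix described above: omit 'ts'. The cell inherits the Put's
        // LATEST_TIMESTAMP and the region server assigns the actual write time.
        put.add(family, qualifier, value);

        // Put.has(family, qualifier) compares cell timestamps against the
        // Put's ts, which is why the mismatch made the has() check fail.
        System.out.println(put.has(family, qualifier));
      }
    }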
On Sun, Mar 25, 2012 at 8:47 AM, Ted Yu <[email protected]> wrote:

> Do you know whether the has() method was called from line 207 or line 192?
>
> According to the Pig code:
>
>   public Put createPut(Object key, byte type) throws IOException {
>     Put put = new Put(objToBytes(key, type));
>
> leading to the following ctor:
>
>   public Put(byte [] row, RowLock rowLock) {
>     this(row, HConstants.LATEST_TIMESTAMP, rowLock);
>
> FYI
>
> On Sun, Mar 25, 2012 at 1:35 AM, Nick Pinckernell <[email protected]> wrote:
>
> > Thank you! That got me in the right direction. Yes, my region observer
> > overrides prePut().
> >
> > Here is what I found out through debugging the region server: when using
> > the HBase client API, the Put has the correct KeyValue timestamp (which
> > matches its Mutation 'ts'), but when using Pig to load it, the timestamps
> > do not match up, so the Put.has() method [line 255] does not return true
> > on line 273 from the following check:
> >
> >   if (Arrays.equals(kv.getFamily(), family) &&
> >       Arrays.equals(kv.getQualifier(), qualifier)
> >       && kv.getTimestamp() == ts) {
> >
> > failing on 'kv.getTimestamp() == ts'.
> >
> > I'm not yet sure why the KeyValue timestamp (obtained from
> > KeyValue.getTimestamp()) is being set incorrectly by the Pig load.
> >
> > On Sat, Mar 24, 2012 at 3:37 PM, Ted Yu <[email protected]> wrote:
> >
> > > hbase.mapreduce.TableOutputFormat is used by HBaseStorage. The Put
> > > reaches the region server and ends up in HRegion.doMiniBatchPut(),
> > > where I see:
> > >
> > >   if (coprocessorHost != null) {
> > >     for (int i = 0; i < batchOp.operations.length; i++) {
> > >       Pair<Put, Integer> nextPair = batchOp.operations[i];
> > >       Put put = nextPair.getFirst();
> > >       if (coprocessorHost.prePut(put, walEdit, put.getWriteToWAL())) {
> > >
> > > Was your code in prePut()?
> > >
> > > Cheers
> > >
> > > On Sat, Mar 24, 2012 at 11:19 AM, Nick Pinckernell <[email protected]> wrote:
> > >
> > > > Hi, I posted this over at the Pig forums and Dmitriy suggested I ask
> > > > on the HBase list as well (original post here:
> > > > http://mail-archives.apache.org/mod_mbox/pig-user/201203.mbox/ajax/%3CCABsY1jQFaiw%3Dbirw3ZukmdwKmY6EV9z75%2BxSTU_%2BmZsyBwsB2A%40mail.gmail.com%3E
> > > > )
> > > >
> > > > I'm having a possible issue with a simple Pig load that writes to an
> > > > HBase table: when I run the test Pig script, it does not invoke the
> > > > region observer coprocessor on the table. I have verified that my
> > > > coprocessor executes when I use the HBase client API to do a simple
> > > > put().
> > > > The simple Pig script is as follows (test.pig):
> > > >
> > > >   register /dev/hbase-0.92.0/hbase-0.92.0.jar;
> > > >   register /dev/hbase-0.92.0/lib/zookeeper-3.4.2.jar;
> > > >   register /dev/hbase-0.92.0/lib/guava-r09.jar;
> > > >   A = load '/tmp/testdata.csv' using PigStorage(',');
> > > >   store A into 'hbase://test' using
> > > >       org.apache.pig.backend.hadoop.hbase.HBaseStorage('f:t');
> > > >
> > > > Using the following environment variables and command:
> > > >
> > > >   export HADOOP_HOME=/dev/hadoop-1.0.0
> > > >   export PIG_CLASSPATH=/dev/hadoop-1.0.0/conf
> > > >   export HBASE_HOME=/dev/hbase-0.92.0/
> > > >   export PIG_CLASSPATH="`${HBASE_HOME}/bin/hbase classpath`:$PIG_CLASSPATH"
> > > >   /dev/pig-0.9.2/bin/pig -x local -f test.pig
> > > >
> > > > I have also tried 'pig -x mapreduce' and it still does not seem to
> > > > invoke the coprocessor. After looking through the HBaseStorage class,
> > > > it appears that the RecordWriter is getting HBase Put objects and that
> > > > ultimately those are getting flushed, so I'm not sure why the
> > > > coprocessor is not executing.
> > > >
> > > > Is this by design, or am I missing something about how the output from
> > > > the Pig job is being loaded into the HBase table?
> > > >
> > > > Thank you
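For completeness, a minimal sketch of the kind of region observer discussed in this thread, written against the HBase 0.92 coprocessor API; the class name and log line are illustrative, not the poster's actual code:

    import java.io.IOException;

    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.coprocessor.BaseRegionObserver;
    import org.apache.hadoop.hbase.coprocessor.ObserverContext;
    import org.apache.hadoop.hbase.coprocessor.RegionCoprocessorEnvironment;
    import org.apache.hadoop.hbase.regionserver.wal.WALEdit;
    import org.apache.hadoop.hbase.util.Bytes;

    // Illustrative observer. In 0.92 it can be loaded region-server-wide via
    // hbase.coprocessor.region.classes in hbase-site.xml, or attached to a
    // single table through a 'coprocessor$1' table attribute.
    public class DemoObserver extends BaseRegionObserver {
      @Override
      public void prePut(ObserverContext<RegionCoprocessorEnvironment> e,
          Put put, WALEdit edit, boolean writeToWAL) throws IOException {
        // Fires for every Put that reaches HRegion.doMiniBatchPut(), whether
        // it came from the HBase client API or from Pig's HBaseStorage.
        System.out.println("prePut row=" + Bytes.toStringBinary(put.getRow()));
      }
    }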
