D*mn. Making me feel smart for once. :) Are you sure there's no old HFile hanging around?
----- Original Message -----
From: Rohit Kelkar <[email protected]>
To: "[email protected]" <[email protected]>
Sent: Tuesday, July 16, 2013 4:39 PM
Subject: Re: Scanner problem after bulk load hfile

Now it's working correctly. I had to do a

myTableWriter.appendTrackedTimestampsToMetadata();

after writing my KVs and before closing the file.

- R

On Tue, Jul 16, 2013 at 6:20 PM, Rohit Kelkar <[email protected]> wrote:
> Oh wait. Didn't realise that I had the HBaseAdmin major compact code
> turned on when I tested :(
> It is still not working. Following is the code -
>
> StoreFile.Writer myHfileWriter = new StoreFile.WriterBuilder(hbaseConf,
>     new CacheConfig(hbaseConf), hdfs,
>     HFile.DEFAULT_BLOCKSIZE).withFilePath(myHFilePath).build();
> KeyValue kv = new KeyValue(row.getBytes(), cf.getBytes(),
>     keyStr.getBytes(), System.currentTimeMillis(), valueStr.getBytes());
> myHfileWriter.append(kv);
> myHfileWriter.close();
>
> - R
>
> On Tue, Jul 16, 2013 at 6:15 PM, Ted Yu <[email protected]> wrote:
>
>> Looks like the following should be put in RefGuide.
>>
>> Cheers
>>
>> On Tue, Jul 16, 2013 at 3:40 PM, lars hofhansl <[email protected]> wrote:
>>
>>> Hah. Was *just* about to reply with this. The fix in HBASE-8055 is not
>>> strictly necessary.
>>> How did you create your HFiles? See this comment:
>>> https://issues.apache.org/jira/browse/HBASE-8055?focusedCommentId=13600499&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13600499
>>>
>>> -- Lars
>>> ________________________________
>>> From: Jimmy Xiang <[email protected]>
>>> To: user <[email protected]>
>>> Sent: Tuesday, July 16, 2013 2:41 PM
>>> Subject: Re: Scanner problem after bulk load hfile
>>>
>>> HBASE-8055 should have fixed it.
>>>
>>> On Tue, Jul 16, 2013 at 2:33 PM, Rohit Kelkar <[email protected]> wrote:
>>>
>>>> This ( http://pastebin.com/yhx4apCG ) is the error on the region server
>>>> side when I execute the following on the shell -
>>>> get 'mytable', 'myrow', 'cf:q'
>>>>
>>>> - R
>>>>
>>>> On Tue, Jul 16, 2013 at 3:28 PM, Jimmy Xiang <[email protected]> wrote:
>>>>
>>>>> Do you see any exception/logging on the region server side?
>>>>>
>>>>> On Tue, Jul 16, 2013 at 1:15 PM, Rohit Kelkar <[email protected]> wrote:
>>>>>
>>>>>> Yes. I tried everything from myTable.flushCommits() to
>>>>>> myTable.clearRegionCache() before and after the
>>>>>> LoadIncrementalHFiles.doBulkLoad(). But it doesn't seem to work.
>>>>>> This is what I am doing right now to get things moving, although I
>>>>>> think this may not be the recommended approach -
>>>>>>
>>>>>> HBaseAdmin hbaseAdmin = new HBaseAdmin(hbaseConf);
>>>>>> hbaseAdmin.majorCompact(myTableName.getBytes());
>>>>>> myTable.close();
>>>>>> hbaseAdmin.close();
>>>>>>
>>>>>> - R
>>>>>>
>>>>>> On Mon, Jul 15, 2013 at 9:14 AM, Amit Sela <[email protected]> wrote:
>>>>>>
>>>>>>> Well, I know it's kind of voodoo but try it once before pre-split
>>>>>>> and once after. Worked for me.
>>>>>>>
>>>>>>> On Mon, Jul 15, 2013 at 7:27 AM, Rohit Kelkar <[email protected]> wrote:
>>>>>>>
>>>>>>>> Thanks Amit, I am also using 0.94.2. I am also pre-splitting and I
>>>>>>>> tried the table.clearRegionCache() but it still doesn't work.
>>>>>>>>
>>>>>>>> - R
>>>>>>>>
>>>>>>>> On Sun, Jul 14, 2013 at 3:45 AM, Amit Sela <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> If new regions are created during the bulk load (are you
>>>>>>>>> pre-splitting?), maybe try myTable.clearRegionCache() after the
>>>>>>>>> bulk load (or even after the pre-splitting if you do pre-split).
>>>>>>>>> This should clear the region cache. I needed to use this because
>>>>>>>>> I am pre-splitting my tables for bulk load.
>>>>>>>>> BTW I'm using HBase 0.94.2
>>>>>>>>> Good luck!
>>>>>>>>>
>>>>>>>>> On Fri, Jul 12, 2013 at 6:50 PM, Rohit Kelkar <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> I am having problems while scanning a table created using an HFile.
>>>>>>>>>> This is what I am doing -
>>>>>>>>>> Once the HFile is created I use the following code to bulk load -
>>>>>>>>>>
>>>>>>>>>> LoadIncrementalHFiles loadTool = new LoadIncrementalHFiles(conf);
>>>>>>>>>> HTable myTable = new HTable(conf, mytablename.getBytes());
>>>>>>>>>> loadTool.doBulkLoad(new Path(outputHFileBaseDir + "/" + mytablename),
>>>>>>>>>>     myTable);
>>>>>>>>>>
>>>>>>>>>> Then scan the table using -
>>>>>>>>>>
>>>>>>>>>> HTable table = new HTable(conf, mytable);
>>>>>>>>>> Scan scan = new Scan();
>>>>>>>>>> scan.addColumn("cf".getBytes(), "q".getBytes());
>>>>>>>>>> ResultScanner scanner = table.getScanner(scan);
>>>>>>>>>> for (Result rr = scanner.next(); rr != null; rr = scanner.next()) {
>>>>>>>>>>     numRowsScanned += 1;
>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>> This code crashes with the following error -
>>>>>>>>>> http://pastebin.com/SeKAeAST
>>>>>>>>>> If I remove the scan.addColumn from the code then the code works.
>>>>>>>>>>
>>>>>>>>>> Similarly on the hbase shell -
>>>>>>>>>> - A simple count 'mytable' in the hbase shell gives the correct count.
>>>>>>>>>> - A scan 'mytable' gives correct results.
>>>>>>>>>> - get 'mytable', 'myrow', 'cf:q' crashes
>>>>>>>>>>
>>>>>>>>>> The hadoop dfs -ls /hbase/mytable shows the .tableinfo, .tmp, the
>>>>>>>>>> directory for the region, etc.
>>>>>>>>>>
>>>>>>>>>> Now if I do a major_compact 'mytable' and then execute my code with
>>>>>>>>>> the scan.addColumn statement then it works. Also the get 'mytable',
>>>>>>>>>> 'myrow', 'cf:q' works.
>>>>>>>>>>
>>>>>>>>>> My question is: what is major_compact doing to enable the scanner
>>>>>>>>>> that the LoadIncrementalHFiles tool is not? I am sure I am missing
>>>>>>>>>> a step after the LoadIncrementalHFiles.
>>>>>>>>>>
>>>>>>>>>> - R
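For readers who land on this thread later: below is a minimal, self-contained
sketch of the writer side with the fix folded in, against the HBase 0.94.x
API quoted above. The path and the row/family/qualifier names ("mytable",
"cf", "q", "myrow") are placeholders, not anything from the poster's actual
setup.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.io.hfile.CacheConfig;
import org.apache.hadoop.hbase.io.hfile.HFile;
import org.apache.hadoop.hbase.regionserver.StoreFile;
import org.apache.hadoop.hbase.util.Bytes;

public class WriteHFileSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        FileSystem fs = FileSystem.get(conf);
        // doBulkLoad expects HFiles under <baseDir>/<columnFamily>/, so
        // the file goes in a "cf" subdirectory (placeholder path).
        Path hfilePath = new Path("/tmp/bulkload/mytable/cf/hfile-0001");

        StoreFile.Writer writer = new StoreFile.WriterBuilder(
                conf, new CacheConfig(conf), fs, HFile.DEFAULT_BLOCKSIZE)
                .withFilePath(hfilePath)
                .build();
        try {
            // KeyValues must be appended in sorted order.
            writer.append(new KeyValue(Bytes.toBytes("myrow"),
                    Bytes.toBytes("cf"), Bytes.toBytes("q"),
                    System.currentTimeMillis(), Bytes.toBytes("myvalue")));
            // The step the thread was missing: persist the tracked
            // timestamp range into the file metadata before closing.
            // Without it, column-qualified gets/scans can fail until a
            // major compaction rewrites the file (see HBASE-8055).
            writer.appendTrackedTimestampsToMetadata();
        } finally {
            writer.close();
        }
    }
}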

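And the loading/verification side, consolidating the snippets quoted earlier
in the thread (same placeholder names). Once the metadata call above is in
place, the column-qualified scan works without the major_compact workaround:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles;
import org.apache.hadoop.hbase.util.Bytes;

public class BulkLoadAndScanSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "mytable"); // placeholder table
        try {
            // Load every family directory under the base dir.
            new LoadIncrementalHFiles(conf)
                    .doBulkLoad(new Path("/tmp/bulkload/mytable"), table);

            // The column-qualified scan that crashed before the fix.
            Scan scan = new Scan();
            scan.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("q"));
            ResultScanner scanner = table.getScanner(scan);
            try {
                int numRowsScanned = 0;
                for (Result rr = scanner.next(); rr != null; rr = scanner.next()) {
                    numRowsScanned += 1;
                }
                System.out.println("rows scanned: " + numRowsScanned);
            } finally {
                scanner.close();
            }
        } finally {
            table.close();
        }
    }
}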