My HBase version is 0.94.12
2013/12/20 Tao Xiao <[email protected]>

> Hi Ted,
>
> You asked me to check the log of LoadIncrementalHFiles to see what the
> error from the region server was, but where is the log of
> LoadIncrementalHFiles? Is it written into the log of the region server?
> It seems the region server works well.
>
> 2013/12/19 Ted Yu <[email protected]>
>
>> From the stack trace posted I saw:
>>
>>     org.apache.commons.logging.impl.Log4JLogger.error(Log4JLogger.java:257)
>>     at org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.tryAtomicRegionLoad(LoadIncrementalHFiles.java:577)
>>
>> Assuming 0.94 is used, line 577 at the tip of 0.94 is:
>>
>>     LOG.warn("Attempt to bulk load region containing "
>>         + Bytes.toStringBinary(first) + " into table "
>>
>> But the following should be the corresponding line w.r.t. the stack trace:
>>
>>     } catch (IOException e) {
>>       LOG.error("Encountered unrecoverable error from region server", e);
>>
>> Tao:
>> Can you check the log of LoadIncrementalHFiles to see what the error
>> from the region server was?
>>
>> As Jieshan said, checking the region server log would reveal something.
>>
>> Cheers
>>
>> On Tue, Dec 17, 2013 at 10:40 PM, Bijieshan <[email protected]> wrote:
>>
>>> It seems LoadIncrementalHFiles is still running. Can you run "jstack" on
>>> 1 RegionServer process also?
>>>
>>> Which version are you using?
>>>
>>> Jieshan.
>>> -----Original Message-----
>>> From: Tao Xiao [mailto:[email protected]]
>>> Sent: Wednesday, December 18, 2013 1:49 PM
>>> To: [email protected]
>>> Subject: Re: Why so many unexpected files like partitions_xxxx are created?
>>>
>>> I did jstack one such process and can see the following output in the
>>> terminal, and I guess this info tells us that the processes started by
>>> the command "LoadIncrementalHFiles" never exit. Why didn't they exit
>>> after they finished running?
>>>
>>> ... ...
>>> ... ...
>> > >> > "LoadIncrementalHFiles-0.LruBlockCache.EvictionThread" daemon prio=10 >> > tid=0x000000004129c000 nid=0x2186 in Object.wait() [0x00007f53f3665000] >> > java.lang.Thread.State: WAITING (on object monitor) >> > at java.lang.Object.wait(Native Method) >> > - waiting on <0x000000075fcf3370> (a >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache$EvictionThread) >> > at java.lang.Object.wait(Object.java:485) >> > at >> > >> > >> org.apache.hadoop.hbase.io.hfile.LruBlockCache$EvictionThread.run(LruBlockCache.java:631) >> > - locked <0x000000075fcf3370> (a >> > org.apache.hadoop.hbase.io.hfile.LruBlockCache$EvictionThread) >> > at java.lang.Thread.run(Thread.java:662) >> > >> > Locked ownable synchronizers: >> > - None >> > >> > "LoadIncrementalHFiles-3" prio=10 tid=0x00007f540ca55800 nid=0x2185 >> > runnable [0x00007f53f3765000] >> > java.lang.Thread.State: RUNNABLE >> > at java.io.FileOutputStream.writeBytes(Native Method) >> > at java.io.FileOutputStream.write(FileOutputStream.java:282) >> > at java.io.BufferedOutputStream.write(BufferedOutputStream.java:105) >> > - locked <0x0000000763e5af70> (a java.io.BufferedOutputStream) >> > at java.io.PrintStream.write(PrintStream.java:430) >> > - locked <0x0000000763d5b670> (a java.io.PrintStream) >> > at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:202) >> > at sun.nio.cs.StreamEncoder.implWrite(StreamEncoder.java:263) >> > at sun.nio.cs.StreamEncoder.write(StreamEncoder.java:106) >> > - locked <0x0000000763d6c6d0> (a java.io.OutputStreamWriter) >> > at sun.nio.cs.StreamEncoder.write(StreamEncoder.java:116) >> > at java.io.OutputStreamWriter.write(OutputStreamWriter.java:203) >> > at java.io.Writer.write(Writer.java:140) >> > at org.apache.log4j.helpers.QuietWriter.write(QuietWriter.java:48) >> > at >> org.apache.log4j.WriterAppender.subAppend(WriterAppender.java:317) >> > at org.apache.log4j.WriterAppender.append(WriterAppender.java:162) >> > at >> > org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:251) >> > - locked <0x0000000763d5fb90> (a org.apache.log4j.ConsoleAppender) >> > at >> > >> > >> org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66) >> > at org.apache.log4j.Category.callAppenders(Category.java:206) >> > - locked <0x0000000763d65fe8> (a org.apache.log4j.spi.RootLogger) >> > at org.apache.log4j.Category.forcedLog(Category.java:391) >> > at org.apache.log4j.Category.log(Category.java:856) >> > at >> > org.apache.commons.logging.impl.Log4JLogger.error(Log4JLogger.java:257) >> > at >> > >> > >> org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.tryAtomicRegionLoad(LoadIncrementalHFiles.java:577) >> > at >> > >> > >> org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles$1.call(LoadIncrementalHFiles.java:316) >> > at >> > >> > >> org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles$1.call(LoadIncrementalHFiles.java:314) >> > at >> java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) >> > at java.util.concurrent.FutureTask.run(FutureTask.java:138) >> > at >> > >> > >> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895) >> > at >> > >> > >> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918) >> > at java.lang.Thread.run(Thread.java:662) >> > >> > Locked ownable synchronizers: >> > - <0x000000075fe494c0> (a >> > java.util.concurrent.locks.ReentrantLock$NonfairSync) >> > >> > ... ... >> > ... ... 
>> > >> > "Reference Handler" daemon prio=10 tid=0x00007f540c138800 nid=0x2172 in >> > Object.wait() [0x00007f5401355000] >> > java.lang.Thread.State: WAITING (on object monitor) >> > at java.lang.Object.wait(Native Method) >> > - waiting on <0x0000000763d51078> (a java.lang.ref.Reference$Lock) >> > at java.lang.Object.wait(Object.java:485) >> > at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:116) >> > - locked <0x0000000763d51078> (a java.lang.ref.Reference$Lock) >> > >> > Locked ownable synchronizers: >> > - None >> > >> > "main" prio=10 tid=0x00007f540c00e000 nid=0x216a waiting on condition >> > [0x00007f54114ac000] >> > java.lang.Thread.State: WAITING (parking) >> > at sun.misc.Unsafe.park(Native Method) >> > - parking to wait for <0x000000075ea67310> (a >> > java.util.concurrent.FutureTask$Sync) >> > at java.util.concurrent.locks.LockSupport.park(LockSupport.java:156) >> > at >> > >> > >> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:811) >> > at >> > >> > >> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:969) >> > at >> > >> > >> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1281) >> > at >> java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:218) >> > at java.util.concurrent.FutureTask.get(FutureTask.java:83) >> > at >> > >> > >> org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.bulkLoadPhase(LoadIncrementalHFiles.java:326) >> > at >> > >> > >> org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.doBulkLoad(LoadIncrementalHFiles.java:261) >> > at >> > >> > >> org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.run(LoadIncrementalHFiles.java:780) >> > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65) >> > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79) >> > at >> > >> > >> org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles.main(LoadIncrementalHFiles.java:785) >> > >> > Locked ownable synchronizers: >> > - None >> > >> > "VM Thread" prio=10 tid=0x00007f540c132000 nid=0x2170 runnable >> > >> > "Gang worker#0 (Parallel GC Threads)" prio=10 tid=0x00007f540c01c800 >> > nid=0x216b runnable >> > >> > "Gang worker#1 (Parallel GC Threads)" prio=10 tid=0x00007f540c01e800 >> > nid=0x216c runnable >> > >> > "Gang worker#2 (Parallel GC Threads)" prio=10 tid=0x00007f540c020000 >> > nid=0x216d runnable >> > >> > "Gang worker#3 (Parallel GC Threads)" prio=10 tid=0x00007f540c022000 >> > nid=0x216e runnable >> > >> > "Concurrent Mark-Sweep GC Thread" prio=10 tid=0x00007f540c0b1000 >> > nid=0x216f runnable "VM Periodic Task Thread" prio=10 >> > tid=0x00007f540c16b000 nid=0x217a waiting on condition >> > >> > JNI global references: 1118 >> > >> > >> > 2013/12/18 Ted Yu <[email protected]> >> > >> > > Tao: >> > > Can you jstack one such process next time you see them hanging ? >> > > >> > > Thanks >> > > >> > > >> > > On Tue, Dec 17, 2013 at 6:31 PM, Tao Xiao <[email protected]> >> > > wrote: >> > > >> > > > BTW, I noticed another problem. 
>>>>> I bulk load data into HBase every five minutes, but I found that
>>>>> whenever the following command was executed
>>>>>
>>>>>     hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles HFiles-Dir MyTable
>>>>>
>>>>> a new process called "LoadIncrementalHFiles" appeared.
>>>>>
>>>>> I can see many processes called "LoadIncrementalHFiles" using the
>>>>> command "jps" in the terminal. Why are these processes still there
>>>>> even after the command that bulk loads HFiles into HBase has finished
>>>>> executing? I have to kill them myself.
>>>>>
>>>>> 2013/12/17 Bijieshan <[email protected]>
>>>>>
>>>>>> Yes, it should be cleaned up, but that is not included in the current
>>>>>> code, in my understanding.
>>>>>>
>>>>>> Jieshan.
>>>>>> -----Original Message-----
>>>>>> From: Ted Yu [mailto:[email protected]]
>>>>>> Sent: Tuesday, December 17, 2013 10:55 AM
>>>>>> To: [email protected]
>>>>>> Subject: Re: Why so many unexpected files like partitions_xxxx are created?
>>>>>>
>>>>>> Should the bulk load task clean up partitions_xxxx upon completion?
>>>>>>
>>>>>> Cheers
>>>>>>
>>>>>> On Mon, Dec 16, 2013 at 6:53 PM, Bijieshan <[email protected]> wrote:
>>>>>>
>>>>>>>> I think I should delete these files immediately after I have
>>>>>>>> finished bulk loading data into HBase since they are useless at
>>>>>>>> that time, right?
>>>>>>>
>>>>>>> Ya, I think so. They are useless once the bulk load task has finished.
>>>>>>>
>>>>>>> Jieshan.
>>>>>>> -----Original Message-----
>>>>>>> From: Tao Xiao [mailto:[email protected]]
>>>>>>> Sent: Tuesday, December 17, 2013 9:34 AM
>>>>>>> To: [email protected]
>>>>>>> Subject: Re: Why so many unexpected files like partitions_xxxx are created?
>>>>>>>
>>>>>>> Indeed these files are produced by
>>>>>>> org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles in the
>>>>>>> directory specified by what job.getWorkingDirectory() returns, and I
>>>>>>> think I should delete these files immediately after I have finished
>>>>>>> bulk loading data into HBase since they are useless at that time,
>>>>>>> right?
>>>>>>>
>>>>>>> 2013/12/16 Bijieshan <[email protected]>
>>>>>>>
>>>>>>>> The reduce partition information is stored in this partition_XXXX
>>>>>>>> file. See the code below:
>>>>>>>>
>>>>>>>> HFileOutputFormat#configureIncrementalLoad:
>>>>>>>> .....................
>>>>>>>>     Path partitionsPath = new Path(job.getWorkingDirectory(),
>>>>>>>>         "partitions_" + UUID.randomUUID());
>>>>>>>>     LOG.info("Writing partition information to " + partitionsPath);
>>>>>>>>
>>>>>>>>     FileSystem fs = partitionsPath.getFileSystem(conf);
>>>>>>>>     writePartitions(conf, partitionsPath, startKeys);
>>>>>>>> .....................
>>>>>>>>
>>>>>>>> Hope it helps.
>>>>>>>>
>>>>>>>> Jieshan
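Since, as Jieshan notes above, 0.94 does not remove these partition files itself, one option is a small cleanup step run after each bulk load has completed. The sketch below is not part of HBase; the /user/root working directory and the partitions_ prefix come from the listing quoted further down in this thread, and the one-hour age threshold is an arbitrary safeguard so that a load still in progress is not disturbed.

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    /**
     * Deletes leftover partitions_* files from the job working directory
     * after a bulk load has completed. Sketch only: the directory and the
     * one-hour age threshold are assumptions, not HBase defaults.
     */
    public class PartitionsFileCleaner {

      public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        // /user/root is where the files showed up in this thread, because
        // it is the working directory of the job that wrote them.
        Path workingDir = new Path("/user/root");

        long oneHourAgo = System.currentTimeMillis() - 60 * 60 * 1000L;

        // Match only the partition files written by configureIncrementalLoad.
        FileStatus[] leftovers = fs.globStatus(new Path(workingDir, "partitions_*"));
        if (leftovers == null) {
          return;
        }
        for (FileStatus stat : leftovers) {
          // Skip files that might belong to a bulk load still in progress.
          if (stat.getModificationTime() < oneHourAgo) {
            System.out.println("Deleting " + stat.getPath());
            fs.delete(stat.getPath(), false);
          }
        }
      }
    }

Run from cron, or at the end of the load script, something like this keeps the working directory from accumulating thousands of partitions_* files.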
>>>>>>>> -----Original Message-----
>>>>>>>> From: Tao Xiao [mailto:[email protected]]
>>>>>>>> Sent: Monday, December 16, 2013 6:48 PM
>>>>>>>> To: [email protected]
>>>>>>>> Subject: Why so many unexpected files like partitions_xxxx are created?
>>>>>>>>
>>>>>>>> I imported data into HBase via bulk load, but after that I found that
>>>>>>>> many unexpected files had been created in the HDFS directory
>>>>>>>> /user/root/, like these:
>>>>>>>>
>>>>>>>> /user/root/partitions_fd74866b-6588-468d-8463-474e202db070
>>>>>>>> /user/root/partitions_fd867cd2-d9c9-48f5-9eec-185b2e57788d
>>>>>>>> /user/root/partitions_fda37b8a-a882-4787-babc-8310a969f85c
>>>>>>>> /user/root/partitions_fdaca2f4-2792-41f6-b7e8-61a8a5677dea
>>>>>>>> /user/root/partitions_fdd55baa-3a12-493e-8844-a23ae83209c5
>>>>>>>> /user/root/partitions_fdd85a3c-9abe-45d4-a0c6-76d2bed88ea5
>>>>>>>> /user/root/partitions_fe133460-5f3f-4c6a-9fff-ff6c62410cc1
>>>>>>>> /user/root/partitions_fe29a2b0-b281-465f-8d4a-6044822d960a
>>>>>>>> /user/root/partitions_fe2fa6fa-9066-484c-bc91-ec412e48d008
>>>>>>>> /user/root/partitions_fe31667b-2d5a-452e-baf7-a81982fe954a
>>>>>>>> /user/root/partitions_fe3a5542-bc4d-4137-9d5e-1a0c59f72ac3
>>>>>>>> /user/root/partitions_fe6a9407-c27b-4a67-bb50-e6b9fd172bc9
>>>>>>>> /user/root/partitions_fe6f9294-f970-473c-8659-c08292c27ddd
>>>>>>>> ... ...
>>>>>>>> ... ...
>>>>>>>>
>>>>>>>> It seems that they are HFiles, but I don't know why they were created here.
>>>>>>>>
>>>>>>>> I bulk load data into HBase in the following way:
>>>>>>>>
>>>>>>>> First, I wrote a MapReduce program which only has map tasks. The map
>>>>>>>> tasks read some text data and emit it in the form of RowKey and
>>>>>>>> KeyValue. The following is my program:
>>>>>>>>
>>>>>>>>     @Override
>>>>>>>>     protected void map(NullWritable NULL, GtpcV1SignalWritable signal,
>>>>>>>>         Context ctx) throws InterruptedException, IOException {
>>>>>>>>       String strRowkey = xxx;
>>>>>>>>       byte[] rowkeyBytes = Bytes.toBytes(strRowkey);
>>>>>>>>
>>>>>>>>       rowkey.set(rowkeyBytes);
>>>>>>>>
>>>>>>>>       part1.init(signal);
>>>>>>>>       part2.init(signal);
>>>>>>>>
>>>>>>>>       KeyValue kv = new KeyValue(rowkeyBytes, Family_A, Qualifier_Q,
>>>>>>>>           part1.serialize());
>>>>>>>>       ctx.write(rowkey, kv);
>>>>>>>>
>>>>>>>>       kv = new KeyValue(rowkeyBytes, Family_B, Qualifier_Q,
>>>>>>>>           part2.serialize());
>>>>>>>>       ctx.write(rowkey, kv);
>>>>>>>>     }
>>>>>>>>
>>>>>>>> After the MR program finished, there were several HFiles generated
>>>>>>>> in the output directory I specified.
>>>>>>>>
>>>>>>>> Then I began to load these HFiles into HBase using the following command:
>>>>>>>>
>>>>>>>>     hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles HFiles-Dir MyTable
>>>>>>>>
>>>>>>>> Finally, I could see that the data was indeed loaded into the table in HBase.
>>>>>>>>
>>>>>>>> But I could also see that many unexpected files had been generated
>>>>>>>> in the HDFS directory /user/root/, just as I mentioned at the
>>>>>>>> beginning of this mail, and I did not specify any files to be
>>>>>>>> produced in this directory.
>>>>>>>>
>>>>>>>> What happened? Who can tell me what these files are and who produced them?
>>>>>>>>
>>>>>>>> Thanks
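For completeness, the end-to-end flow discussed in this thread (a map-only job that emits KeyValues, followed by loading the resulting HFiles) can also be driven from one program, so the load step runs inside the same JVM rather than as a separate "hbase ... LoadIncrementalHFiles" process. The sketch below is only an illustration against the 0.94-era APIs quoted above: the mapper is a stand-in for the GtpcV1Signal mapper (whose row keys, families and serialization are not shown in the thread), and "MyTable", the "f"/"q" family and qualifier, and the command-line paths are placeholders.

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.KeyValue;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    import org.apache.hadoop.hbase.mapreduce.HFileOutputFormat;
    import org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles;
    import org.apache.hadoop.hbase.util.Bytes;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class BulkLoadDriver {

      // Stand-in for the GtpcV1Signal mapper from the thread: reads
      // tab-separated "rowkey<TAB>value" lines and emits one KeyValue each.
      public static class HFileMapper
          extends Mapper<LongWritable, Text, ImmutableBytesWritable, KeyValue> {

        private static final byte[] FAMILY = Bytes.toBytes("f");
        private static final byte[] QUALIFIER = Bytes.toBytes("q");

        @Override
        protected void map(LongWritable key, Text line, Context ctx)
            throws IOException, InterruptedException {
          String[] fields = line.toString().split("\t", 2);
          byte[] rowkey = Bytes.toBytes(fields[0]);
          KeyValue kv = new KeyValue(rowkey, FAMILY, QUALIFIER, Bytes.toBytes(fields[1]));
          ctx.write(new ImmutableBytesWritable(rowkey), kv);
        }
      }

      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "MyTable");

        Path input = new Path(args[0]);
        Path hfileDir = new Path(args[1]);   // e.g. the "HFiles-Dir" from the thread

        Job job = new Job(conf, "bulk-load MyTable");
        job.setJarByClass(BulkLoadDriver.class);
        job.setMapperClass(HFileMapper.class);
        // configureIncrementalLoad() looks at the map output value class to
        // pick the sorting reducer, so set these before calling it.
        job.setMapOutputKeyClass(ImmutableBytesWritable.class);
        job.setMapOutputValueClass(KeyValue.class);
        FileInputFormat.addInputPath(job, input);
        FileOutputFormat.setOutputPath(job, hfileDir);

        // This is the call that writes partitions_<uuid> into the job's
        // working directory (e.g. /user/root), as quoted above.
        HFileOutputFormat.configureIncrementalLoad(job, table);

        if (!job.waitForCompletion(true)) {
          System.exit(1);
        }

        // Load the generated HFiles in this same JVM instead of spawning a
        // separate "hbase ... LoadIncrementalHFiles" process.
        new LoadIncrementalHFiles(conf).doBulkLoad(hfileDir, table);
        table.close();
      }
    }

Whether loading in the same JVM also avoids the lingering LoadIncrementalHFiles processes seen with jps was not confirmed in this thread; the partitions_<uuid> file is still written either way, so the cleanup step sketched earlier still applies.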
