Hello Ryan and the list,

Well, I am still stuck. In addition to making the changes Ryan recommended to my hadoop-site.xml file (see below), I also added a line for HBase to /etc/security/limits.conf and had fs.file-max hugely increased, to hopefully handle any file handle limit problem.

Still no luck with my upload program. It fails about where it did before, around the loading of the 160,000th row into the one table that I create in HBase. This time I didn't get the "too many open files" msg, but I did get "handleConnectionFailure" at the same place in the upload.

I then tried a complete reinstall of HBase and Hadoop, upgrading from 0.19.0 to 0.19.1, used the same config parameters as before, and reran the program. It fails again, at about the same number of rows uploaded - and I'm back to getting "too many open files" as what I think is the principal error msg.

So - does anybody have any suggestions? I am running a "pseudo-distributed" installation of Hadoop on one Red Hat Linux machine with about 3 GB of RAM. Are there any known problems with bulk uploads when running "pseudo-distributed" on a single box, rather than a true cluster? Is there anything else I can try?

Ron
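For anyone who hits the same wall: the file-handle changes described above are typically made along the lines below. The user name and the numbers here are illustrative only, not necessarily the exact values used on this machine.

# /etc/security/limits.conf - raise the open-file limit for the account
# that runs the Hadoop/HBase daemons ("hadoop" here is an example user name)
hadoop   soft   nofile   32768
hadoop   hard   nofile   32768

# system-wide ceiling on open files, set via sysctl (example value)
sysctl -w fs.file-max=262144

# log out and back in (or restart the daemons) so the new limit is picked up, then check:
ulimit -n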
___________________________________________
Ronald Taylor, Ph.D.
Computational Biology & Bioinformatics Group
Pacific Northwest National Laboratory
902 Battelle Boulevard
P.O. Box 999, MSIN K7-90
Richland, WA 99352 USA
Office: 509-372-6568
Email: [email protected]
www.pnl.gov

________________________________
From: Ryan Rawson [mailto:[email protected]]
Sent: Friday, April 03, 2009 5:56 PM
To: Taylor, Ronald C
Subject: Re: FW: Still need help with data upload into HBase

Welcome to HBase :-) This is pretty much how it goes for nearly every new user. We might want to review our docs...

On Fri, Apr 3, 2009 at 5:54 PM, Taylor, Ronald C <[email protected]> wrote:

Thanks. I'll make those settings, too, in addition to bumping up the file handle limit, and give it another go.
Ron

-----Original Message-----
From: Ryan Rawson [mailto:[email protected]]
Sent: Friday, April 03, 2009 5:48 PM
To: [email protected]
Subject: Re: Still need help with data upload into HBase

Hey,

File handle - yes... there was a FAQ and/or getting started page which talks about upping lots of limits. I have these set in my hadoop-site.xml (which is read by the datanode):

<property>
  <name>dfs.datanode.max.xcievers</name>
  <value>2047</value>
</property>
<property>
  <name>dfs.datanode.handler.count</name>
  <value>10</value>
</property>

I should probably set the datanode.handler.count higher. Don't forget to toss a reasonable amount of RAM at HDFS... not sure what that is exactly, but -Xmx1000m wouldn't hurt.

On Fri, Apr 3, 2009 at 5:44 PM, Taylor, Ronald C <[email protected]> wrote:
>
> Hi Ryan,
>
> Thanks for the info. Re checking the Hadoop datanode log file: I just
> did so, and found a "too many open files" error. Checking the HBase FAQ,
> I see that I should drastically bump up the file handle limit. So I will
> give that a try.
>
> Question: what does the xciver variable do? My hadoop-site.xml file does
> not contain any entry for such a var. (Nothing reported in the datanode
> log file either with the word "xciver".)
>
> Re using the local file system: well, as soon as I get a nice data set
> loaded in, I'm starting a demo project manipulating it for our Env
> Molecular Sciences Lab (EMSL), a DOE National User Facility. And I'm
> supposed to be doing the manipulating using MapReduce programs, to show
> the usefulness of such an approach. So I need Hadoop and HDFS, and I
> would prefer to keep using HBase on top of Hadoop rather than the local
> Linux file system. Hopefully the "small HDFS clusters" issues you
> mention are survivable. Eventually, some of this programming might wind
> up on Chinook, our 160 teraflop supercomputer cluster, but that's a ways
> down the road. I'm starting on my Linux desktop.
>
> I'll try bumping up the file handle limit, restart Hadoop and HBase, and
> see what happens.
> Ron
>
> ___________________________________________
> Ronald Taylor, Ph.D.
> Computational Biology & Bioinformatics Group
> Pacific Northwest National Laboratory
> 902 Battelle Boulevard
> P.O. Box 999, MSIN K7-90
> Richland, WA 99352 USA
> Office: 509-372-6568
> Email: [email protected]
> www.pnl.gov
>
>
> -----Original Message-----
> From: Ryan Rawson [mailto:[email protected]]
> Sent: Friday, April 03, 2009 5:08 PM
> To: [email protected]
> Subject: Re: Still need help with data upload into HBase
>
> Hey,
>
> Can you check the datanode logs? You might be running into the dreaded
> xciver limit :-(
>
> try upping the xciver setting in hadoop-site.xml... I run at 2048.
>
> -ryan
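A side note on the heap advice just above, since the question of heap sizes comes up again further down the thread: in the 0.19-era scripts the daemon heap is normally set through the env files rather than by hand-editing the java command line. A minimal sketch, assuming the stock conf/ layout; 1000 MB is simply the shipped default, not a tuned recommendation:

# conf/hadoop-env.sh - heap, in MB, for the Hadoop daemons (datanode included)
export HADOOP_HEAPSIZE=1000

# conf/hbase-env.sh - the equivalent knob for the HBase daemons
export HBASE_HEAPSIZE=1000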
>
> -----Original Message-----
> From: Ryan Rawson [mailto:[email protected]]
> Sent: Friday, April 03, 2009 5:13 PM
> To: [email protected]
> Subject: Re: Still need help with data upload into HBase
>
> NotReplicatedYet is probably what you think - HDFS hasn't placed blocks
> on more nodes yet. This could be due to the pseudo-distributed nature of
> your set-up. I'm not familiar with that configuration, so I can't really
> say more.
>
> If you only have 1 machine, you might as well just go with local files.
> HDFS gets you distributed replication, but until you have many machines,
> it won't buy you anything and will only cause problems, since small HDFS
> clusters are known to have issues.
>
> Good luck (again!)
> -ryan
>
> On Fri, Apr 3, 2009 at 5:07 PM, Ryan Rawson <[email protected]> wrote:
>
> > Hey,
> >
> > Can you check the datanode logs? You might be running into the dreaded
> > xciver limit :-(
> >
> > try upping the xciver setting in hadoop-site.xml... I run at 2048.
> >
> > -ryan
> >
> >
> > On Fri, Apr 3, 2009 at 4:35 PM, Taylor, Ronald C <[email protected]> wrote:
> >
> >>
> >> Hello folks,
> >>
> >> I have just tried using Ryan's doCommit() method for my bulk upload into
> >> one HBase table. No luck. I still start to get errors around row
> >> 160,000. On-screen, the program starts to generate error msgs like so:
> >> ...
> >> INFO: Retrying connect to server: /127.0.0.1:60383. Already tried 8 time(s).
> >> Apr 3, 2009 2:39:52 PM org.apache.hadoop.hbase.ipc.HBaseClient$Connection handleConnectionFailure
> >> INFO: Retrying connect to server: /127.0.0.1:60383. Already tried 9 time(s).
> >> Apr 3, 2009 2:39:57 PM org.apache.hadoop.hbase.ipc.HBaseClient$Connection handleConnectionFailure
> >> INFO: Retrying connect to server: /127.0.0.1:60383. Already tried 0 time(s).
> >> Apr 3, 2009 2:39:58 PM org.apache.hadoop.hbase.ipc.HBaseClient$Connection handleConnectionFailure
> >> INFO: Retrying connect to server: /127.0.0.1:60383. Already tried 1 time(s).
> >> ...
> >> In regard to log file information, I have appended at the bottom some of
> >> the output from my hbase-<user>-master-<machine>.log file, at the place
> >> where it looks to me like things might have started to go wrong. Several
> >> questions:
> >>
> >> 1) Is there any readily apparent cause for such an
> >> HBaseClient$Connection handleConnectionFailure to occur in an HBase
> >> installation configured on a Linux desktop to work in the
> >> pseudo-distributed operation mode? From my understanding, even importing
> >> ~200,000 rows (each row being filled with info for ten columns) is a
> >> minimal data set for HBase, and the upload should not be failing like
> >> this.
> >>
> >> FYI - minimal changes were made to the HBase default settings in the
> >> HBase ../conf/ config files when I installed HBase 0.19.0. I have one
> >> entry in hbase-env.sh, to set JAVA_HOME, and one property entry in
> >> hbase-site.xml, to set the hbase.rootdir.
> >>
> >> 2) My Linux box has about 3 GB of memory. I left the HADOOP_HEAP and
> >> HBASE_HEAP sizes at their default values, which I understand are 1000 MB
> >> each. Should I have changed either value?
> >>
> >> 3) I left the dfs.replication value at the default of "3" in the
> >> hadoop-site.xml file, for my test of pseudo-distributed operation.
> >> Should I have changed that to "1", for operation on my single machine?
> >> Downsizing to "1" would appear to me to negate trying out Hadoop in the
> >> pseudo-distributed operation mode, so I left the value "as is" - but did
> >> I get this wrong?
> >>
> >> 4) In the log output below, you can see that HBase starts to block and
> >> then unblock updates to my one HBase table (called the
> >> "ppInteractionTable", for protein-protein interaction table). A little
> >> later, a msg says that the ppInteractionTable has been closed. At this
> >> point, my program has *not* issued a command to close the table - that
> >> only happens at the end of the program. So - why is this happening?
> >>
> >> Also, near the end of my log extract, I get a different error msg:
> >> NotReplicatedYetException. I have no idea what that means. Actually, I
> >> don't really have a grasp yet on what any of these error msgs is
> >> supposed to tell us. So - once again, any help would be much
> >> appreciated.
> >>
> >> Ron
> >>
> >> ___________________________________________
> >> Ronald Taylor, Ph.D.
> >> Computational Biology & Bioinformatics Group
> >> Pacific Northwest National Laboratory
> >> 902 Battelle Boulevard
> >> P.O. Box 999, MSIN K7-90
> >> Richland, WA 99352 USA
> >> Office: 509-372-6568
> >> Email: [email protected]
> >> www.pnl.gov
> >>
> >>
> >> -----Original Message-----
> >> From: Taylor, Ronald C
> >> Sent: Tuesday, March 31, 2009 5:48 PM
> >> To: '[email protected]'
> >> Cc: Taylor, Ronald C
> >> Subject: Novice Hbase user needs help with data upload - gets a
> >> RetriesExhaustedException, followed by NoServerForRegionException
> >>
> >>
> >> Hello folks,
> >>
> >> This is my first msg to the list - I just joined today, and I am a
> >> novice Hadoop/HBase programmer. I have a question:
> >>
> >> I have written a Java program to create an HBase table and then enter a
> >> number of rows into the table. The only way I have found so far to do
> >> this is to enter each row one by one, creating a new BatchUpdate
> >> updateObj for each row, doing about ten updateObj.put()'s to add the
> >> column data, and then doing a tableObj.commit(updateObj). There's
> >> probably a more efficient way (happy to hear, if so!), but this is what
> >> I'm starting with.
> >>
> >> When I do this on input that creates 3,000 rows, the program works fine.
> >> When I try it on input that would create 300,000 rows (still relatively
> >> small for an HBase table, I would think), the program terminates around
> >> row 160,000 or so, generating first a RetriesExhaustedException,
> >> followed by a NoServerForRegionException. The HBase server crashes, and
> >> I have to restart it. The Hadoop server appears to remain OK and does
> >> not need restarting.
> >>
> >> Can anybody give me any guidance? I presume that I might need to adjust
> >> some setting for larger input in the HBase and/or Hadoop config files.
> >> At present, I am using default settings. I have installed Hadoop 0.19.0
> >> and HBase 0.19.0 in the "pseudo" cluster mode on a single machine, my
> >> Red Hat Linux desktop, which has 3 GB of RAM.
> >>
> >> Any help / suggestions would be much appreciated.
> >>
> >> Cheers,
> >> Ron Taylor
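For anyone searching the archives later, the loading pattern described in that first message looks roughly like the sketch below, written against the 0.19 client API (BatchUpdate / HTable). The table name comes from the thread; the column family, row keys, column values, and the batch size of 1,000 are made up for illustration, and grouping updates into a List before committing is just one way to cut down on round trips - it is not necessarily the doCommit() method mentioned earlier in the thread.

import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.io.BatchUpdate;
import org.apache.hadoop.hbase.util.Bytes;

public class PpInteractionLoader {

    public static void main(String[] args) throws Exception {
        HBaseConfiguration conf = new HBaseConfiguration();
        // table name taken from the thread; the table must already exist
        HTable table = new HTable(conf, "ppInteractionTable");

        List<BatchUpdate> batch = new ArrayList<BatchUpdate>();
        for (int i = 0; i < 300000; i++) {
            // stand-in for parsing one line of the real input file
            BatchUpdate update = new BatchUpdate("row-" + i);
            // one put() per column; "interaction:" is a made-up column family
            update.put("interaction:proteinA", Bytes.toBytes("protA-" + i));
            update.put("interaction:proteinB", Bytes.toBytes("protB-" + i));
            batch.add(update);

            // commit in groups rather than one row at a time; if this client
            // version lacks the List form of commit(), call
            // table.commit(update) for each row instead
            if (batch.size() == 1000) {
                table.commit(batch);
                batch.clear();
            }
        }
        if (!batch.isEmpty()) {
            table.commit(batch);   // flush whatever is left over
        }
    }
}

Batching like this reduces client round trips, but as the rest of the thread shows, it does not by itself address datanode file-handle or xceiver limits.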
