Thanks!!! Mission completed ;)

On Wed, Sep 2, 2009 at 4:26 PM, Jean-Daniel Cryans <[email protected]> wrote:
> Well, you configured it in hbase.rootdir, something like /hbase, so you
> need to do "./bin/hadoop dfs -ls /hbase"
>
> J-D
>
> On Wed, Sep 2, 2009 at 8:30 AM, Xine Jar <[email protected]> wrote:
> > :)
> >
> > Since I am seeing neither the ROOT nor the METADATA, I am obviously on
> > the wrong path. I thought it should be seen in the DFS, where a
> > MapReduce program takes its input file from and stores its output
> > file, and the default for me is:
> >
> > pc150:~/Desktop/hbase-0.19.3 # /root/Desktop/hadoop-0.19.1/bin/hadoop dfs -ls
> > Found 2 items
> > drwxr-xr-x   - root supergroup   0 2009-08-31 22:21  /user/root/input
> > drwxr-xr-x   - root supergroup   0 2009-09-02 16:02  /user/root/output
> >
> > If there is another path, could you please tell me where it is
> > configured, so that I can check it?
> >
> > Thank you
> >
> > On Wed, Sep 2, 2009 at 12:28 PM, Jean-Daniel Cryans <[email protected]> wrote:
> >> Same drill.
> >>
> >> J-D
> >>
> >> On Wed, Sep 2, 2009 at 5:51 AM, Xine Jar <[email protected]> wrote:
> >> > Hallo,
> >> > The theoretical concept of the table is clear to me. I am aware
> >> > that writes are kept in memory in a buffer called the memtable,
> >> > and whenever this buffer reaches a threshold, the memtable is
> >> > automatically flushed to disk.
> >> >
> >> > Now I have tried to flush the table by executing the following:
> >> >
> >> > hbase(main):001:0> flush 'myTable'
> >> > 0 row(s) in 0.2019 seconds
> >> >
> >> > hbase(main):002:0> describe 'myTable'
> >> > {NAME => 'myTable', FAMILIES => [{NAME => 'cf', COMPRESSION =>
> >> > 'NONE', VERSIONS => '3', LENGTH => '2147483647', TTL => '-1',
> >> > IN_MEMORY => 'false', BLOCKCACHE => 'false'}]}
> >> >
> >> > Q1 - The expression "0 row(s) in 0.2019" means that it did not
> >> > flush anything?!
> >>
> >> Nah, it's just the way we count the rows we show in the shell. In
> >> this case we did not increment some counter, so it shows "0 row";
> >> it's a UI bug. BTW, describing your table won't tell you how many
> >> rows you have or how many are still kept in the memtable.
> >>
> >> > Q2 - IN_MEMORY=FALSE means that the table is not in memory? So is
> >> > it on disk? If so, I still cannot see it in the DFS when executing
> >> > "bin/hadoop dfs -ls".
> >>
> >> This is a family-scope property that tells HBase to keep it always
> >> in RAM (but also on disk, it's not ephemeral). In your case, it
> >> means that HBase shouldn't do anything in particular for that
> >> family.
> >>
> >> Are you sure you are doing an ls at the right place in the
> >> filesystem? Do you see the META and ROOT folders? Is there any data
> >> in your table? You can do a "count" in the shell to make sure.
> >>
> >> > Thank you for taking a look at that
> >> >
> >> > Regards,
> >> > CJ
> >> >
> >> > On Tue, Sep 1, 2009 at 7:13 PM, Jean-Daniel Cryans <[email protected]> wrote:
> >> >> Inline.
> >> >>
> >> >> J-D
> >> >>
> >> >> On Tue, Sep 1, 2009 at 1:05 PM, Xine Jar <[email protected]> wrote:
> >> >> > Thank you,
> >> >> >
> >> >> > While the answers to Q3 and Q4 were clear enough, I still have
> >> >> > some problems with the first two questions.
> >> >>
> >> >> Good
> >> >>
> >> >> > - Which entry in the hbase-default.xml allows me to check the
> >> >> > size of a tablet?
> >> >>
> >> >> Those are configuration parameters, not commands. A region will
> >> >> split when a family gets that size. See
> >> >> http://wiki.apache.org/hadoop/Hbase/HbaseArchitecture#hregion for
> >> >> more info on splitting.
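For reference, checking the effective value of that parameter on this
setup looks roughly like the following. This is a sketch: the
hbase-0.19.3 install path is the one shown earlier in this thread, and
an override, if present, would sit in conf/hbase-site.xml rather than
hbase-default.xml.

    # print the property name plus the two lines after it (value, description)
    pc150:~ # grep -A 2 hbase.hregion.max.filesize \
        /root/Desktop/hbase-0.19.3/conf/hbase-default.xml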
> >> >> > - In hadoop, I used to copy a file to the DFS by doing
> >> >> > "bin/hadoop dfs -copyFromLocal filesource fileDFS".
> >> >> > Having this file in the DFS, I could list it with "bin/hadoop
> >> >> > dfs -ls" and check its size by doing "bin/hadoop dfs -du
> >> >> > fileDFS".
> >> >> > But when I create an hbase table, this table does not appear in
> >> >> > the DFS, so the latter command gives an error that it cannot
> >> >> > find the table! How can I point to the folder of the table?
> >> >>
> >> >> Just make sure the table is flushed to disk; the writes are kept
> >> >> in memory, as described in the link I pasted for the previous
> >> >> question. You can force that by going into the shell and issuing
> >> >> "flush 'table'", where 'table' is replaced with the name of your
> >> >> table.
> >> >>
> >> >> > Regards,
> >> >> > CJ
> >> >> >
> >> >> > On Tue, Sep 1, 2009 at 5:00 PM, Jean-Daniel Cryans <[email protected]> wrote:
> >> >> >> Answers inline.
> >> >> >>
> >> >> >> J-D
> >> >> >>
> >> >> >> On Tue, Sep 1, 2009 at 10:53 AM, Xine Jar <[email protected]> wrote:
> >> >> >> > Hallo,
> >> >> >> > I have a cluster of 6 nodes running hadoop 0.19.1 and hbase
> >> >> >> > 0.19.3. I have managed to write small programs to test the
> >> >> >> > settings, and everything seems to be fine.
> >> >> >> >
> >> >> >> > I wrote a MapReduce program reading a small hbase table (100
> >> >> >> > rows, one column family, 6 columns) and summing some values.
> >> >> >> > In my opinion the job is slow; it takes 19 seconds. I would
> >> >> >> > like to look more closely at what is going on, e.g. whether
> >> >> >> > the table is split into tablets or not. I would therefore
> >> >> >> > appreciate it if someone could answer the following
> >> >> >> > questions:
> >> >> >>
> >> >> >> With that size, that's expected. You would be better off
> >> >> >> scanning your table directly instead; MapReduce has a startup
> >> >> >> cost, and 19 seconds isn't that much.
> >> >> >>
> >> >> >> > Q1 - Does the value of "hbase.hregion.max.filesize" in the
> >> >> >> > hbase-default.xml indicate the maximum size of a tablet in
> >> >> >> > bytes?
> >> >> >>
> >> >> >> It's the maximum size of a family (in a region) in bytes.
> >> >> >>
> >> >> >> > Q2 - How can I know the size of the hbase table I have
> >> >> >> > created? (I guess the "describe" command from the shell does
> >> >> >> > not provide it)
> >> >> >>
> >> >> >> Size as in disk space? You could use the hadoop dfs -du
> >> >> >> command on your table's folder.
> >> >> >>
> >> >> >> > Q3 - Is there a way to know the real number of tablets
> >> >> >> > constituting my table?
> >> >> >>
> >> >> >> In the Master's web UI, click on the name of your table. If
> >> >> >> you want to do that programmatically, you can do it indirectly
> >> >> >> by calling HTable.getEndKeys(); the size of that array is the
> >> >> >> number of regions.
> >> >> >>
> >> >> >> > Q4 - Is there a way to get more information on the tablets
> >> >> >> > handled by each regionserver? (their number, the rows
> >> >> >> > constituting each tablet)
> >> >> >>
> >> >> >> In the Master's web UI, click on the region server you want
> >> >> >> info for. Getting the number of rows inside a region, for the
> >> >> >> moment, can't be done directly (it requires doing a scan
> >> >> >> between the start and end keys of a region and counting the
> >> >> >> number of rows you see).
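Once the table has been flushed, the disk-space answer to Q2 and a
rough filesystem-level take on Q3 look like this. This is a sketch
with two assumptions: the rootdir is /hbase as above, and this HBase
version keeps one directory per region under the table's folder, which
is worth verifying on your own cluster.

    # disk usage of the table's folder (Q2)
    pc150:~ # /root/Desktop/hadoop-0.19.1/bin/hadoop dfs -du /hbase/myTable

    # recursive listing; if the per-region layout assumption holds, each
    # top-level subdirectory corresponds to one region (Q3)
    pc150:~ # /root/Desktop/hadoop-0.19.1/bin/hadoop dfs -lsr /hbase/myTable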
> >> >> >> > Thank you for your help,
> >> >> >> > CJ
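Putting J-D's answers together, the sequence that presumably led to
"mission completed" at the top of this thread is, as a sketch (the
table name 'myTable' and the /hbase rootdir are the ones assumed
throughout):

    # force the memtable out to HDFS, then sanity-check the row count
    hbase(main):001:0> flush 'myTable'
    hbase(main):002:0> count 'myTable'

    # the table, along with the ROOT and META folders, should now show up
    # under hbase.rootdir instead of the /user/root home directory
    pc150:~ # /root/Desktop/hadoop-0.19.1/bin/hadoop dfs -ls /hbase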
