Re: Tomb Stone Marker

2012-09-10 Thread Monish r
Hi, Thanks for the link. If the metadata for a delete is part of the KeyValue, then when does this update happen? When the region is rewritten by a minor compaction? Or is the region rewritten for a set of batched deletes? On Sun, Sep 9, 2012 at 6:42 PM, Doug Meil

Re: for CDH4.0, where can i find the hbase-default.xml file if using RPM install

2012-09-10 Thread Monish r
Hi, Try rpm -qlp *rpm_file_name.rpm* This will list all files in the RPM; from this you can see where hbase-default.xml is. On Sat, Sep 8, 2012 at 3:16 PM, John Hancock jhancock1...@gmail.com wrote: Huaxiang, This may not be the quickest way to find it, but if it's anywhere in your

Re: Regionservers are dead...How to make them live again

2012-09-10 Thread Monish r
Hi, Try checking the log files of both HDFS (if it is used) and HBase to find out why the region server is going down. If possible, post the logs and I can have a look at them. On Mon, Sep 10, 2012 at 10:46 AM, iwannaplay games funnlearnfork...@gmail.com wrote: Its weird. I restarted

Re: Local debugging (possibly with Maven and HBaseTestingUtility?)

2012-09-10 Thread Ulrich Staudinger
Hi there, my AQ Master Server might be of interest to you. I have an embedded HBase server in it, and it's very straightforward to use: http://activequant.org/uberjar.html What I essentially do is described here:

bulk loading regions number

2012-09-10 Thread Oleg Ruchovets
Hi, I am using bulk loading to write my data to HBase. It works fine, but the number of regions is growing very rapidly. After loading ONE WEEK of data I got 200 regions (I am going to save years of data). As a result, the job that writes data to HBase has a number of REDUCERS equal to the number of REGIONS. So entering

Re: bulk loading regions number

2012-09-10 Thread Harsh J
Hi Oleg, If the root issue is a growing number of regions, why not control that instead of a way to control the Reducer count? You could, for example, raise the split-point sizes for HFiles, to not have it split too much, and hence have larger but fewer regions? Given that you have 10 machines,

Re: About Reloading Coprocessors

2012-09-10 Thread Sever Fundatureanu
If it is associated to a certain table, you only have to disable the table, reload coprocessor, enable the table. Regards, Sever On Wed, Sep 5, 2012 at 5:18 AM, Aaron Wong aw...@crunchyroll.com wrote: Hello all, I have an endpoint coprocessor running in HBase that I would like to modify. I

Re: Tomb Stone Marker

2012-09-10 Thread Doug Meil
Hi there... In this chapter... http://hbase.apache.org/book.html#datamodel .. it explains that the updates are just a view. There is a merge happening across CFs and versions (and delete-markers).. In this... http://hbase.apache.org/book.html#regions.arch 9.7.5.5. Compaction ... it

Re: bulk loading regions number

2012-09-10 Thread Marcos Ortiz
Well, the default value for a region is 256 MB, so if you want to store a lot of data, you should consider increasing that value. With the preSplit() method, you can control how to do this process. On 09/10/2012 04:45 AM, Oleg Ruchovets wrote: Great That is actually what I am
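The effect of raising the maximum region size can be estimated with simple arithmetic. The sketch below is not HBase code and uses no HBase API; the class and method names are hypothetical. It assumes roughly 50 GB of data per week (200 regions at 256 MB each, which only holds if every region is full) and a made-up alternative limit of 4 GB. In HBase the limit itself is set with the hbase.hregion.max.filesize property.

```java
// Rough region-count arithmetic; illustrative only, not tied to any HBase API.
final class RegionEstimate {
    static final long MB = 1024L * 1024L;
    static final long GB = 1024L * MB;

    /** Regions needed if every region is filled up to maxRegionBytes (ceiling division). */
    static long regionsFor(long dataBytes, long maxRegionBytes) {
        if (dataBytes < 0 || maxRegionBytes <= 0) {
            throw new IllegalArgumentException("sizes must be non-negative / positive");
        }
        return (dataBytes + maxRegionBytes - 1) / maxRegionBytes;
    }

    public static void main(String[] args) {
        // Assumption: one week of data is about 200 regions x 256 MB = 50 GB.
        long weekly = 50 * GB;
        System.out.println(regionsFor(weekly, 256 * MB)); // prints 200
        System.out.println(regionsFor(weekly, 4 * GB));   // prints 13
    }
}
```

Under these assumptions a 16x larger region limit cuts the weekly region count from 200 to 13. Real clusters will differ, since regions are rarely exactly full and pre-split points also affect the count.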

HBase aggregate query

2012-09-10 Thread iwannaplay games
Hi, I want to run a query like select month(eventdate),scene,count(1),sum(timespent) from eventlog group by month(eventdate),scene in HBase. Through Hive it's taking a lot of time for 40 million records. Do we have any syntax in HBase to find its result? In SQL Server it takes around 9 minutes. How

Re: HBase aggregate query

2012-09-10 Thread Ted Yu
Hi, Are you able to get the number you want through hive log ? Thanks On Mon, Sep 10, 2012 at 7:03 AM, iwannaplay games funnlearnfork...@gmail.com wrote: Hi , I want to run query like select month(eventdate),scene,count(1),sum(timespent) from eventlog group by month(eventdate),scene

Re: HBase aggregate query

2012-09-10 Thread Srinivas Mupparapu
HBase only provides CRUD operations by means of Put/Get/Delete API and there is no built in SQL interface. Thanks, Srinivas M On Sep 10, 2012 9:03 AM, iwannaplay games funnlearnfork...@gmail.com wrote: Hi , I want to run query like select month(eventdate),scene,count(1),sum(timespent) from

Re: HBase aggregate query

2012-09-10 Thread iwannaplay games
It's taking very long. On Mon, Sep 10, 2012 at 7:34 PM, Ted Yu yuzhih...@gmail.com wrote: Hi, Are you able to get the number you want through hive log ? Thanks On Mon, Sep 10, 2012 at 7:03 AM, iwannaplay games funnlearnfork...@gmail.com wrote: Hi , I want to run query like

Hbase filter-SubstringComparator vs full text search indexing

2012-09-10 Thread Shengjie Min
In my case, I have all the log events stored in HDFS/HBase in this format: timestamp | priority | category | message body Given I have only 4 fields here, that limits my queries to these four. I am thinking about more advanced search, like full-text search of the message body. well,

Re: HBase aggregate query

2012-09-10 Thread Doug Meil
Hi there, if there are common questions I'd suggest creating summary tables of the pre-aggregated results. http://hbase.apache.org/book.html#mapreduce.example 7.2.4. HBase MapReduce Summary to HBase Example On 9/10/12 10:03 AM, iwannaplay games funnlearnfork...@gmail.com wrote: Hi , I
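The pre-aggregation idea can be sketched in plain Java. This is an in-memory stand-in, not the MapReduce job from the book example: the class names, the "MM|scene" key layout, and the ISO date format are all assumptions made for illustration. Each summary row holds count(1) and sum(timespent) for one (month, scene) pair, mirroring the GROUP BY in the question.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;

// In-memory sketch of a summary table; a real job would write these rows to HBase.
final class SummaryRollup {
    /** One raw event; field names follow the SQL in the question. */
    static final class Event {
        final String eventDate; // assumed ISO format, e.g. "2012-09-10"
        final String scene;
        final long timeSpent;

        Event(String eventDate, String scene, long timeSpent) {
            this.eventDate = eventDate;
            this.scene = scene;
            this.timeSpent = timeSpent;
        }
    }

    /** Returns rows keyed by "MM|scene"; value[0] = count(1), value[1] = sum(timespent). */
    static Map<String, long[]> rollup(Iterable<Event> events) {
        Map<String, long[]> summary = new TreeMap<>();
        for (Event e : events) {
            // Characters 5..6 of an ISO date are the month, like month(eventdate).
            String key = e.eventDate.substring(5, 7) + "|" + e.scene;
            long[] agg = summary.computeIfAbsent(key, k -> new long[2]);
            agg[0] += 1;
            agg[1] += e.timeSpent;
        }
        return summary;
    }

    public static void main(String[] args) {
        Map<String, long[]> rows = rollup(Arrays.asList(
                new Event("2012-09-10", "lobby", 30),
                new Event("2012-09-11", "lobby", 12),
                new Event("2012-08-01", "shop", 7)));
        for (Map.Entry<String, long[]> row : rows.entrySet()) {
            System.out.println(row.getKey() + " count=" + row.getValue()[0]
                    + " sum=" + row.getValue()[1]);
        }
    }
}
```

Once such rows exist, the aggregate query becomes a read of a handful of summary rows instead of a pass over all 40 million events.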

HBase UI missing region list for active/functioning table

2012-09-10 Thread Norbert Burger
Hi all -- we're currently on cdh3u3 (0.90.4 + patches). I have one table in our cluster which seems to be functioning fine (gets/puts/scans are all working), but for which no regions are listed on the UI. The table/regions exist in .META. Other tables in the same cluster show their regions list

Re: for CDH4.0, where can i find the hbase-default.xml file if using RPM install

2012-09-10 Thread huaxiang
Hi, I can't find the hbase-default.xml file using the following command. Is there any other way? To be clear, this Hadoop was installed with the CDH RPM package. Huaxiang
[root@hadoop1 ~]# clear
[root@hadoop1 ~]# rpm -qlp *rpm_file_name.rpm*
[root@hadoop1 ~]# ^C
[root@hadoop1 ~]# find / -name

Re: BigDecimalColumnInterpreter

2012-09-10 Thread Julian Wissmann
Hi, I haven't really gotten to working on this since last Wednesday. I checked readFields() and write() today, but don't really see why I would need to reimplement those. Admittedly I'm not that into the whole HBase codebase yet, so there is a good chance I'm missing something here. Also,

Re: Hbase filter-SubstringComparator vs full text search indexing

2012-09-10 Thread Jacques
Two cents below... On Mon, Sep 10, 2012 at 7:24 AM, Shengjie Min kelvin@gmail.com wrote: In my case, I have all the log events stored in HDFS/hbase in this format: timestamp | priority | category | message body Given I have only 4 fields here, that limits my queries to only against

Doubt in performance tuning

2012-09-10 Thread Ramasubramanian
Hi, Currently it takes about 11 minutes to load 1.2 million records into HBase from HDFS. Can you please share some tips to do the same in a few seconds? We tried doing this both with a Pig script and in Pentaho. Both take about 11 minutes. Regards, Rams

Re: Doubt in performance tuning

2012-09-10 Thread Stack
On Mon, Sep 10, 2012 at 9:58 AM, Ramasubramanian ramasubramanian.naraya...@gmail.com wrote: Hi, Currently it takes 11 odd minutes to load 1.2 million record into hbase from hdfs. Can u pls share some tips to do the same in few seconds? We tried doing this in both pig script and in pentaho.

Re: Re: for CDH4.0, where can i find the hbase-default.xml file if using RPM install

2012-09-10 Thread Stack
On Mon, Sep 10, 2012 at 9:02 AM, huaxiang huaxi...@asiainfo-linkage.com wrote: Hi, I don't find the hbase-default.xml file using following command, any other way? To be clear, this hadoop was installed with CDH RPM package. Is it not bundled inside the hbase-*.jar? St.Ack

Re: HBase UI missing region list for active/functioning table

2012-09-10 Thread Stack
On Mon, Sep 10, 2012 at 8:33 AM, Norbert Burger norbert.bur...@gmail.com wrote: Hi all -- we're currently on cdh3u3 (0.90.4 + patches). I have one table in our cluster which seems to functioning fine (gets/puts/scans are all working), but for which no regions are listed on the UI. The

Re: bulk loading regions number

2012-09-10 Thread Harsh J
The decision can be made depending on the total number of regions you want deployed across your 10 machines, and the size you expect the total to be before you have to expand the cluster. Additionally add in a parallelism factor of say 5-10 (or more if you want) regions of the same table

Re: Getting ScannerTimeoutException even after several calls in the specified time limit

2012-09-10 Thread Stack
On Mon, Sep 10, 2012 at 10:13 AM, Dhirendra Singh dps...@gmail.com wrote: I am facing this exception while iterating over a big table, by default i have specified caching as 100, i am getting the below exception, even though i checked there are several calls made to the scanner before it

Re: HBase UI missing region list for active/functioning table

2012-09-10 Thread Srinivas Mupparapu
It scans .META. table just like any other table. I just tested it and it produced the expected output. Thanks, Srinivas M On Sep 10, 2012 12:19 PM, Stack st...@duboce.net wrote: On Mon, Sep 10, 2012 at 8:33 AM, Norbert Burger norbert.bur...@gmail.com wrote: Hi all -- we're currently on

Re: Re: for CDH4.0, where can i find the hbase-default.xml file if using RPM install

2012-09-10 Thread Srinivas Mupparapu
I just installed HBase from .tar.gz file and I couldn't find that file either. Thanks, Srinivas M On Sep 10, 2012 11:03 AM, huaxiang huaxi...@asiainfo-linkage.com wrote: Hi, I don't find the hbase-default.xml file using following command, any other way? To be clear, this hadoop was

Re: Re: for CDH4.0, where can i find the hbase-default.xml file if using RPM install

2012-09-10 Thread Harsh J
HBase in packaged form bundles the default XML only inside the HBase jar(s). You need to download a source package tarball to get the default XML otherwise. /usr/share/doc/hbase-0.92.1+67/hbase-default.xml The above looks right, you can use that as a reference. Looks to be installed via a docs

Re: HBase UI missing region list for active/functioning table

2012-09-10 Thread Stack
On Mon, Sep 10, 2012 at 10:24 AM, Srinivas Mupparapu j2eearchit...@gmail.com wrote: It scans .META. table just like any other table. I just tested it and it produced the expected output. When you refresh the master UI, it makes a few lines in the master log. Are these the lines you posted?

Re: Re: for CDH4.0, where can i find the hbase-default.xml file if using RPM install

2012-09-10 Thread Harsh J
Srinivas, In the source tarball, the file is at $HBASE_HOME/src/main/resources/hbase-default.xml On Mon, Sep 10, 2012 at 10:56 PM, Srinivas Mupparapu j2eearchit...@gmail.com wrote: I just installed HBase from .tar.gz file and I couldn't find that file either. Thanks, Srinivas M On Sep 10,

Re: More rows or less rows and more columns

2012-09-10 Thread Harsh J
Hey Mohit, See http://hbase.apache.org/book.html#schema.smackdown.rowscols On Mon, Sep 10, 2012 at 10:56 PM, Mohit Anchlia mohitanch...@gmail.com wrote: Is there any recommendation on how many columns one should have per row. My columns are 200 bytes. This will help me to decide if I should

Tracking down coprocessor pauses

2012-09-10 Thread Tom Brown
Hi, We have our system set up such that all interaction is done through co-processors. We update the database via a co-processor (it has the appropriate logic for dealing with concurrent access to rows), and we also query/aggregate via co-processor (since we don't want to send all the data over

Re: HBase UI missing region list for active/functioning table

2012-09-10 Thread Norbert Burger
On Mon, Sep 10, 2012 at 1:24 PM, Srinivas Mupparapu j2eearchit...@gmail.com wrote: It scans .META. table just like any other table. I just tested it and it produced the expected output. I'm pretty sure Srinivas scanned .META. in his own environment, not mine. ;-) On Sep 10, 2012 12:19 PM,

Re: More rows or less rows and more columns

2012-09-10 Thread Mohit Anchlia
On Mon, Sep 10, 2012 at 10:30 AM, Harsh J ha...@cloudera.com wrote: Hey Mohit, See http://hbase.apache.org/book.html#schema.smackdown.rowscols Thanks! Is there a way in HBase to get the most recently inserted column? Or a way to sort columns such that I can manage how many columns I want to

Re: HBase UI missing region list for active/functioning table

2012-09-10 Thread Stack
On Mon, Sep 10, 2012 at 10:33 AM, Norbert Burger norbert.bur...@gmail.com wrote: On Mon, Sep 10, 2012 at 1:24 PM, Srinivas Mupparapu j2eearchit...@gmail.com wrote: It scans .META. table just like any other table. I just tested it and it produced the expected output. I'm pretty sure Srinivas

Re: Doubt in performance tuning

2012-09-10 Thread Ramasubramanian
Hi, It would be helpful if you could point to specific things to look into. Please help. Regards, Rams On 10-Sep-2012, at 10:40 PM, Stack st...@duboce.net wrote: On Mon, Sep 10, 2012 at 9:58 AM, Ramasubramanian ramasubramanian.naraya...@gmail.com wrote: Hi, Currently it takes 11 odd minutes to load 1.2

Re: Hbase filter-SubstringComparator vs full text search indexing

2012-09-10 Thread Otis Gospodnetic
Hello, If you need to scan lots of log messages and process them, use HBase (or Hive or Pig or simply HDFS+MR). If you need to query your data set by anything in the text of the log message, use ElasticSearch or Solr 4.0 or Sensei or just Lucene. Otis -- Search Analytics -
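The difference between the two approaches can be illustrated with a toy example in plain Java. This is not how Lucene, Solr, or ElasticSearch are implemented, and it uses no HBase filter API; the class and method names are hypothetical. The scan method checks every message, as a substring filter must do for every row, while the index method builds a term-to-message-ids map once so that a term lookup touches only the matching entries.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

// Toy comparison of a linear substring scan and an inverted index.
final class LogSearch {
    /** Linear scan: inspects every message body for the substring. */
    static List<Integer> scan(List<String> messages, String needle) {
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < messages.size(); i++) {
            if (messages.get(i).contains(needle)) {
                hits.add(i);
            }
        }
        return hits;
    }

    /** Builds a map from lower-cased term to the sorted ids of messages containing it. */
    static Map<String, SortedSet<Integer>> index(List<String> messages) {
        Map<String, SortedSet<Integer>> idx = new HashMap<>();
        for (int i = 0; i < messages.size(); i++) {
            for (String term : messages.get(i).toLowerCase(Locale.ROOT).split("\\W+")) {
                if (term.isEmpty()) {
                    continue; // split can yield an empty leading token
                }
                idx.computeIfAbsent(term, t -> new TreeSet<>()).add(i);
            }
        }
        return idx;
    }

    /** Term lookup: a single map access instead of a pass over all messages. */
    static SortedSet<Integer> lookup(Map<String, SortedSet<Integer>> idx, String term) {
        return idx.getOrDefault(term.toLowerCase(Locale.ROOT), Collections.emptySortedSet());
    }

    public static void main(String[] args) {
        List<String> messages = Arrays.asList(
                "disk full on node7", "connection reset by peer", "disk check ok");
        System.out.println(scan(messages, "disk"));
        System.out.println(lookup(index(messages), "disk"));
    }
}
```

The trade-off is that the index must be built and kept up to date as new log events arrive, which is the work a dedicated search engine takes on.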

Re: HBase UI missing region list for active/functioning table

2012-09-10 Thread Norbert Burger
On Mon, Sep 10, 2012 at 1:37 PM, Stack st...@duboce.net wrote: What version of hbase? We're on cdh3u3, 0.90.4 + patches. Can you disable and reenable the table? I will try disabling/re-enabling at the next opportunity. Perhaps that'll resolve the issue, but this is a PROD cluster, so

Re: More rows or less rows and more columns

2012-09-10 Thread Harsh J
Versions is what you're talking about, and by default all queries return the latest version of updated values. On Mon, Sep 10, 2012 at 11:04 PM, Mohit Anchlia mohitanch...@gmail.com wrote: On Mon, Sep 10, 2012 at 10:30 AM, Harsh J ha...@cloudera.com wrote: Hey Mohit, See

Re: More rows or less rows and more columns

2012-09-10 Thread Mohit Anchlia
On Mon, Sep 10, 2012 at 10:59 AM, Harsh J ha...@cloudera.com wrote: Versions is what you're talking about, and by default all queries return the latest version of updated values. No actually I was asking if I have columns with qualifier: d,b,c,e can I store them sorted such that it is

Re: More rows or less rows and more columns

2012-09-10 Thread Harsh J
Ah, sorry for assuming that then. I don't know of a way to sort qualifiers. I haven't seen anyone do that or require it for unstructured data (i.e. a query like fetch me the latest qualifier added to this row). I suppose you can compare the last two versions to see what was changed, but I still

Re: HBase UI missing region list for active/functioning table

2012-09-10 Thread Stack
On Mon, Sep 10, 2012 at 10:50 AM, Norbert Burger norbert.bur...@gmail.com wrote: On Mon, Sep 10, 2012 at 1:37 PM, Stack st...@duboce.net wrote: What version of hbase? We're on cdh3u3, 0.90.4 + patches. Can you disable and reenable the table? I will try disabling/re-enabling at the next

Re: BigDecimalColumnInterpreter

2012-09-10 Thread anil gupta
Hi Julian, I am using only cdh4 libraries. I use the jars present under the hadoop and hbase install dirs. In my last email I gave you some more pointers. Try to follow them and see what happens. If it still doesn't work for you, then I will try to write a utility to test

Re: Tracking down coprocessor pauses

2012-09-10 Thread Andrew Purtell
On Mon, Sep 10, 2012 at 10:32 AM, Tom Brown tombrow...@gmail.com wrote: I want to know more details about the specifics of those requests; Is there an API I can use that will allow my coprocessor requests to be tracked more functionally? Is there a way to hook into the UI so I can provide my

Re: Tracking down coprocessor pauses

2012-09-10 Thread Michael Segel
On Sep 10, 2012, at 12:32 PM, Tom Brown tombrow...@gmail.com wrote: We have our system setup such that all interaction is done through co-processors. We update the database via a co-processor (it has the appropriate logic for dealing with concurrent access to rows), and we also

Re: HBase UI missing region list for active/functioning table

2012-09-10 Thread Norbert Burger
On Mon, Sep 10, 2012 at 2:17 PM, Stack st...@duboce.net wrote: Thanks. I was asking about the info:regioninfo column that prints out the HRegionInfo for each region. I was wondering if it included a status=offline attribute. You could try one region only and see if that makes a difference.

Re: Doubt in performance tuning

2012-09-10 Thread Michael Segel
Well, Lets actually skip a few rounds of questions... and start from the beginning. What does your physical cluster look like? On Sep 10, 2012, at 12:40 PM, Ramasubramanian ramasubramanian.naraya...@gmail.com wrote: Hi, Will be helpful if u say specific things to look into. Pls help

Re: HBase UI missing region list for active/functioning table

2012-09-10 Thread Stack
On Mon, Sep 10, 2012 at 12:05 PM, Norbert Burger norbert.bur...@gmail.com wrote: Mind putting up full listing in pastebin? Let me have a look. We could try a master restart too... so it refreshes its in-memory state. That might do it. St.Ack

Re: Tracking down coprocessor pauses

2012-09-10 Thread Tom Brown
Michael, We are using HBase to track the usage of our service. Specifically, each client sends an update when they start a task, at regular intervals during the task, and an update when they finish a task (and then presumably they start another, continuing the cycle). Each user has various

Re: Put and Increment atomically

2012-09-10 Thread Jean-Daniel Cryans
Hi Pablo, It's currently not possible (like you saw). What's your use case? Maybe there's a different/better way to achieve what you want to do? J-D On Mon, Sep 3, 2012 at 1:22 PM, Pablo Musa pa...@psafe.com wrote: Hey guys, I want to insert new columns into a row:fam and increment 2 of them

Re: Re: for CDH4.0, where can i find the hbase-default.xml file if using RPM install

2012-09-10 Thread lars hofhansl
You want to look at the hbase-xxx.jar inside the .tar.gz archive. Tested with 0.94.1:
$ tar -O -xf hbase-0.94.1.tar.gz *hbase-0.94.1.jar | jar -t | grep hbase-default.xml
hbase-default.xml
It's there. :) -- Lars - Original Message - From: Srinivas Mupparapu j2eearchit...@gmail.com
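The same check can be done from Java with the standard java.util.jar classes. This is a minimal sketch, assuming the HBase jar has already been extracted from the tarball to a local path passed as the first argument; the class and method names are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Lists jar entries whose name ends with a given file name, similar to `jar -t | grep`.
final class JarSearch {
    /** Returns the names of all entries in the jar that end with fileName. */
    static List<String> find(Path jar, String fileName) throws IOException {
        List<String> matches = new ArrayList<>();
        try (JarFile jarFile = new JarFile(jar.toFile())) {
            Enumeration<JarEntry> entries = jarFile.entries();
            while (entries.hasMoreElements()) {
                String name = entries.nextElement().getName();
                if (name.endsWith(fileName)) {
                    matches.add(name);
                }
            }
        }
        return matches;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java JarSearch <path-to-hbase-jar>");
            return;
        }
        // Assumption: args[0] points at a jar already extracted from the tarball.
        for (String name : find(Paths.get(args[0]), "hbase-default.xml")) {
            System.out.println(name);
        }
    }
}
```

An empty result means the jar does not bundle the file under that name.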

java.io.IOException: Pass a Delete or a Put

2012-09-10 Thread Jothikumar Ekanath
Hi, Getting this error while using hbase as a sink. Error java.io.IOException: Pass a Delete or a Put at org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:125) at

Re: Getting ScannerTimeoutException even after several calls in the specified time limit

2012-09-10 Thread Dhirendra Singh
I tried with a smaller caching value, i.e. 10, and it failed again. No, it's not really a big cell. This small cluster (4 nodes) is only used for HBase, and I am currently using hbase-0.92.1-cdh4.0.1. Could you let me know how I could debug this issue? Caused by: