Re: hbase doubts

2015-08-18 Thread Shahab Yunus
One thought to ponder: If you are going to be splitting continuously and at a quicker pace, do you have a strategy/plan to merge old regions? Otherwise, you can end up with a cluster with a proliferation of regions. Regards, Shahab On Tue, Aug 18, 2015 at 3:55 PM, Shushant Arora

Re: Hbase compound filter support

2015-08-17 Thread Shahab Yunus
is to build something on top of FilterList? -Original Message- From: Shahab Yunus [mailto:shahab.yu...@gmail.com] Sent: Monday, August 17, 2015 16:17 To: user@hbase.apache.org Subject: Re: Hbase compound filter support You can build nested Filter conditions using FilterLists

Re: Hbase compound filter support

2015-08-17 Thread Shahab Yunus
You can build nested Filter conditions using FilterLists of FilterLists. FilterList implements the Filter interface, so the 'add' method will work. Have you checked that approach? Regards, Shahab On Mon, Aug 17, 2015 at 4:07 PM, kannan.ramanat...@barclays.com wrote: Hello, Is there a Java API
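
A minimal sketch of that nesting (family, qualifier, and value names are illustrative):

```java
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp;
import org.apache.hadoop.hbase.filter.FilterList;
import org.apache.hadoop.hbase.filter.SingleColumnValueFilter;
import org.apache.hadoop.hbase.util.Bytes;

public class NestedFilters {
  // Builds (a = 1 AND b = 2) OR c = 3: an AND list nested inside an OR list.
  // FilterList implements Filter, so one list can be added to another.
  public static Scan nestedScan() {
    byte[] cf = Bytes.toBytes("cf");

    FilterList and = new FilterList(FilterList.Operator.MUST_PASS_ALL);
    and.addFilter(new SingleColumnValueFilter(cf, Bytes.toBytes("a"),
        CompareOp.EQUAL, Bytes.toBytes("1")));
    and.addFilter(new SingleColumnValueFilter(cf, Bytes.toBytes("b"),
        CompareOp.EQUAL, Bytes.toBytes("2")));

    FilterList or = new FilterList(FilterList.Operator.MUST_PASS_ONE);
    or.addFilter(and); // the nested AND condition
    or.addFilter(new SingleColumnValueFilter(cf, Bytes.toBytes("c"),
        CompareOp.EQUAL, Bytes.toBytes("3")));

    Scan scan = new Scan();
    scan.setFilter(or);
    return scan;
  }
}
```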

Re: hbase doubts

2015-07-16 Thread Shahab Yunus
My understanding is (please feel free to correct me if I am wrong): For your first question, I think more than efficiency, TableOutputFormat provides the convenience of an out-of-the-box output format which can do the Puts for you with default recommended config settings like flush,

Re: splits and merge

2015-07-16 Thread Shahab Yunus
Ivan, Is it possible that you poll for the number of regions (e.g. in a loop) after invoking split or merge to confirm that the action has been performed? I know it is a crude way but maybe something can be done along these lines. Or are you already doing this when you said 'look for Region Info'?
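
A sketch of that crude polling loop, assuming the 0.98-era HBaseAdmin API:

```java
import java.io.IOException;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class RegionCountPoller {
  // Returns true once the table reports the expected number of regions,
  // false if we give up first. Split/merge requests are asynchronous, so
  // the count may take a while to change.
  public static boolean waitForRegionCount(HBaseAdmin admin, TableName table,
      int expected, int attempts, long sleepMs)
      throws IOException, InterruptedException {
    for (int i = 0; i < attempts; i++) {
      if (admin.getTableRegions(table).size() == expected) {
        return true;
      }
      Thread.sleep(sleepMs);
    }
    return false;
  }
}
```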

Re: [DISCUSS] correcting abusive behavior on mailing lists was (Re: [DISCUSS] Multi-Cluster HBase Client)

2015-07-01 Thread Shahab Yunus
I am very new here and my contribution to the mailing list has been limited as well. I am not even a committer. But I have been following and reading the mailing list for a while. So given that, I am taking the liberty of chiming in with my 2 cents. I don't profess or claim to read other

Re: List of Puts in mapreduce java job

2015-05-19 Thread Shahab Yunus
The error is highlighting the issue. You can't output a List of Puts like this. Your reducer output is Mutation and NOT a list of Mutations. I have handled this scenario by defining my own base abstract class: public abstract class TableReducerBatchPuts<KEYIN, VALUEIN, KEYOUT> extends
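
The quoted signature is truncated; a hedged reconstruction of such a base class (the class name is from the email, the body is an interpretation):

```java
import java.io.IOException;
import java.util.List;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableReducer;

public abstract class TableReducerBatchPuts<KEYIN, VALUEIN>
    extends TableReducer<KEYIN, VALUEIN, ImmutableBytesWritable> {

  // Unrolls a batch into individual writes; TableOutputFormat accepts one
  // Mutation per write and ignores the output key, hence the null.
  protected void writeBatch(List<Put> puts, Context context)
      throws IOException, InterruptedException {
    for (Put put : puts) {
      context.write(null, put);
    }
  }
}
```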

Re: List of Puts in mapreduce java job

2015-05-19 Thread Shahab Yunus
of TableMapReduceUtil.initTableReducerJob(inputTableName, Reduce.class, job); In fact, now it is a Reduce class, not a TableReducer, but I have the same error. Error: java.lang.ClassCastException: java.util.ArrayList cannot be cast to org.apache.hadoop.hbase.client.Mutation 2015-05-19 15:14 GMT+02:00 Shahab Yunus

Re: Load data into hbase

2015-05-18 Thread Shahab Yunus
Lots of options depending upon the specifics of your use case: In addition to Hive... You can use Sqoop http://www.dummies.com/how-to/content/importing-data-into-hbase-with-sqoop.html You can use Pig

Re: client Table instance, confused with autoFlush

2015-05-13 Thread Shahab Yunus
Until you move to HBase 1.*, you should use HTableInterface. The autoFlush methods and semantics, as far as I understand, are the same, so you should not have a problem. Regards, Shahab On Wed, May 13, 2015 at 11:09 AM, Serega Sheypak serega.shey...@gmail.com wrote: But HTable is deprecated in
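
A minimal sketch of that pre-1.0 client API (the table name is illustrative; setAutoFlush lives on HTableInterface in the 0.94/0.98 line):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HConnection;
import org.apache.hadoop.hbase.client.HConnectionManager;
import org.apache.hadoop.hbase.client.HTableInterface;

public class AutoFlushExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HConnection conn = HConnectionManager.createConnection(conf);
    HTableInterface table = conn.getTable("mytable");
    table.setAutoFlush(false); // buffer Puts client-side in the write buffer
    // ... do Puts ...
    table.flushCommits();      // explicit flush; close() also flushes
    table.close();
    conn.close();
  }
}
```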

Re: select/limit hanging on large tables

2015-05-12 Thread Shahab Yunus
In addition to Talat's comment about more info, you can check out the following properties: phoenix.query.queueSize phoenix.query.timeoutMs http://phoenix.apache.org/tuning.html We have set these in the hbase-site.xml on the client machine where Squirrel was running in the cases where the
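
An alternative to editing the client's hbase-site.xml is passing the same keys as properties when opening the Phoenix JDBC connection; the values below are illustrative, not recommendations:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.util.Properties;

public class PhoenixClientConfig {
  public static Connection connect(String zkQuorum) throws Exception {
    Properties props = new Properties();
    // Client-side tuning: same keys the hbase-site.xml approach above sets.
    props.setProperty("phoenix.query.timeoutMs", "600000"); // 10 minutes
    props.setProperty("phoenix.query.queueSize", "5000");
    return DriverManager.getConnection("jdbc:phoenix:" + zkQuorum, props);
  }
}
```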

Re: Mapping Over Cells

2015-05-11 Thread Shahab Yunus
You can specify the column family or column to read when you create the Scan object. Have you tried that? Does it make sense? Or have I misunderstood your problem? http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html#addColumn(byte[],%20byte[])
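
For example (family and qualifier names are illustrative):

```java
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class NarrowScan {
  // Restricts the scan to a single column so only those Cells come back.
  public static Scan singleColumn() {
    Scan scan = new Scan();
    scan.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("col")); // family, qualifier
    return scan;
  }
}
```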

Re: scan startrow and stoprow

2015-04-22 Thread Shahab Yunus
I see that you are already using the partial key scan approach. When you say that you want data from timeStamp2 as well, do you want all the rows from timeStamp2? Also, how are you actually setting the stopRow? What are you providing for the anyNumber field when you are setting stopRow? e.g. if
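
Since the stop row of a Scan is exclusive, one way to include every row under timeStamp2 regardless of the trailing anyNumber part is to stop at timeStamp2 + 1. A sketch assuming a hypothetical rowkey that begins with an 8-byte big-endian timestamp:

```java
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class TimeRangeScans {
  // Assumes rowkeys of the form <8-byte timestamp><suffix...>.
  public static Scan between(long timeStamp1, long timeStamp2) {
    Scan scan = new Scan();
    scan.setStartRow(Bytes.toBytes(timeStamp1));
    // Stop row is exclusive: using timeStamp2 + 1 includes every suffix
    // under timeStamp2, whatever the anyNumber field holds.
    scan.setStopRow(Bytes.toBytes(timeStamp2 + 1));
    return scan;
  }
}
```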

Re: Job MapReduce to populate HBase Table

2015-04-13 Thread Shahab Yunus
For the null key you should use the NullWritable class, as discussed here: http://stackoverflow.com/questions/16198752/advantages-of-using-nullwritable-in-hadoop Regards, Shahab On Mon, Apr 13, 2015 at 7:01 AM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote: Hi Silvio, What is the key you
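
A minimal sketch of a mapper emitting the NullWritable singleton as its key (the input types are illustrative):

```java
import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class NullKeyMapper extends Mapper<LongWritable, Text, NullWritable, Text> {
  @Override
  protected void map(LongWritable key, Text value, Context context)
      throws IOException, InterruptedException {
    // NullWritable.get() is the singleton placeholder for "no key".
    context.write(NullWritable.get(), value);
  }
}
```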

Re: Job MapReduce to populate HBase Table

2015-04-13 Thread Shahab Yunus
... JM 2015-04-13 7:46 GMT-04:00 Shahab Yunus shahab.yu...@gmail.com: For the null key you should use NullWritable class, as discussed here: http://stackoverflow.com/questions/16198752/advantages-of-using-nullwritable-in-hadoop Regards, Shahab On Mon, Apr 13, 2015 at 7

Re: Splitting up an HBase Table into partitions

2015-03-17 Thread Shahab Yunus
If you know the row key range of your data, then you can create split points yourself and then use the HBase API to actually make the splits. E.g. if you know that your row key (and it is a very contrived example) has a range of A - Z, then you can decide on split points at every 5th letter as your
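
A sketch of that contrived A–Z example against the 0.98-era HBaseAdmin API; note split requests are asynchronous:

```java
import java.io.IOException;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.util.Bytes;

public class LetterSplits {
  // Splits an existing table at every 5th letter: F, K, P, U, Z,
  // yielding six regions across the A-Z keyspace.
  public static void split(HBaseAdmin admin, byte[] tableName)
      throws IOException, InterruptedException {
    for (char c = 'F'; c <= 'Z'; c += 5) {
      admin.split(tableName, Bytes.toBytes(String.valueOf(c))); // async request
    }
  }
}
```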

Re: [ANNOUNCE] Apache HBase 1.0.0 is now available for download

2015-02-24 Thread Shahab Yunus
Congrats and thanks to everyone involved. A big milestone! HBase *1.0* Regards Shahab On Tue, Feb 24, 2015 at 2:24 PM, anil gupta anilgupt...@gmail.com wrote: Kudos to HBase Team. Read HA feature sounds exciting. ~Anil On Tue, Feb 24, 2015 at 10:37 AM, Rajeshbabu Chintaguntla

Re: Region balancing query

2015-02-13 Thread Shahab Yunus
many tables are there in your cluster ? Is the cluster balanced overall (in terms of number of regions per server) but this table is not ? What happens (check master log) when you issue 'balancer' command through shell ? Cheers On Fri, Feb 13, 2015 at 8:19 AM, Shahab Yunus shahab.yu

Region balancing query

2015-02-13 Thread Shahab Yunus
CDH 5.3, HBase 0.98.6. We are writing data to an HBase table through an M/R job. We pre-split the table before each job run. The problem is that most of the regions end up on the same RS. This results in one RS being severely overloaded and subsequent M/R jobs failing trying to write to the

Re: Region balancing query

2015-02-13 Thread Shahab Yunus
: bq. all the regions of this table were back on this same RS! Interesting. Please check master log around the time this RS was brought online. You can pastebin the relevant snippet. Thanks On Fri, Feb 13, 2015 at 8:55 AM, Shahab Yunus shahab.yu...@gmail.com wrote: Hi Ted. Yes

Re: Region balancing query

2015-02-13 Thread Shahab Yunus
. See if raising to 100 or 200 helps. On Fri, Feb 13, 2015 at 1:09 PM, Shahab Yunus shahab.yu...@gmail.com wrote: Yes, this sever hosts other regions from other tables as well. Regards Shahab On Fri, Feb 13, 2015 at 1:45 PM, Ted Yu yuzhih...@gmail.com wrote: Interesting, server7

Re: Region balancing query

2015-02-13 Thread Shahab Yunus
,60020,1423845018628 host regions from other table ? Cheers On Fri, Feb 13, 2015 at 10:27 AM, Shahab Yunus shahab.yu...@gmail.com wrote: Table name is: MYTABLE_RECENT_4W_V2 Pastebin snippet 1: http://pastebin.com/dQzMhGyP Pastebin snippet 2: http://pastebin.com/Y7ZsNAgF This is the master log

Behavior of Split method in HBaseAdmin

2015-02-04 Thread Shahab Yunus
If we programmatically split a table via the async method in the HBaseAdmin class, and even after waiting for quite a while the split does not happen (there is no difference in the number of regions before and after the split call) and there is no error or an exception either, does it mean that there is

Re: Behavior of Split method in HBaseAdmin

2015-02-04 Thread Shahab Yunus
. St.Ack On Wed, Feb 4, 2015 at 12:04 PM, Shahab Yunus shahab.yu...@gmail.com wrote: If we programmatically split a table by async method in HBaseAdmin class and even after waiting for quite a while, the split does not happen (there is no difference in number of regions before and after

Re: Disabling the HBase table

2015-01-23 Thread Shahab Yunus
9 Lakh = 900,000 Regards, Shahab On Fri, Jan 23, 2015 at 11:44 AM, Ted Yu yuzhih...@gmail.com wrote: Can you pastebin master / region server log around the time table was disabled ? BTW can you rephrase '9 lakh records' ? I don't know how many records that is. Cheers On Thu, Jan 22,

Re: Help,use the importtsv tools

2014-12-04 Thread Shahab Yunus
You need to pre-split the table into regions, as in this tool the number of reducers is driven by the number of regions in the target table. Read about it here: http://hbase.apache.org/book/perf.writing.html http://hortonworks.com/blog/apache-hbase-region-splitting-and-merging/ Regards, Shahab
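
A minimal sketch of creating the target table pre-split before running importtsv, assuming the 0.98-era HBaseAdmin API; the table name, column family, and hex split points are illustrative:

```java
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.util.Bytes;

public class PreSplitTable {
  // Creates the table with N regions, so the importtsv reduce phase gets
  // N reducers. Split points here assume hex-prefixed rowkeys.
  public static void create(HBaseAdmin admin, String table, int regions)
      throws Exception {
    HTableDescriptor desc = new HTableDescriptor(TableName.valueOf(table));
    desc.addFamily(new HColumnDescriptor("cf"));
    byte[][] splits = new byte[regions - 1][];
    for (int i = 1; i < regions; i++) {
      splits[i - 1] = Bytes.toBytes(String.format("%02x", i * 256 / regions));
    }
    admin.createTable(desc, splits);
  }
}
```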

Re: UI tool

2014-12-01 Thread Shahab Yunus
Yes you can. In fact in some of the vendors' distributions it comes with the standard installation. You can also use Hive, and the more elaborate and powerful but complex Phoenix. Regards, Shahab On Mon, Dec 1, 2014 at 6:15 PM, Jignesh Patel jigneshmpa...@gmail.com wrote: can we use Apache

Re: Current Deployment Sizes

2014-11-21 Thread Shahab Yunus
I think your best bet, to get the latest and most accurate data possible, would be to directly contact the companies (through their Engineering channels) which are known to host large clusters. Most of these companies have public blogs and such, so it should not be hard to find an appropriate contact.

Re: Hierarchy of filters and filters list

2014-11-18 Thread Shahab Yunus
, Shahab Yunus shahab.yu...@gmail.com wrote: Missed couple of things. 1- I am using SingleColumnValueFilter and the comparator is BinaryComparator which is passed into it. 2- CDH 5.1.0 (Hbase is 0.98.1-cdh5.1.0) Regards, Shahab On Tue, Nov 18, 2014 at 12:22 AM, Shahab Yunus

Re: Hierarchy of filters and filters list

2014-11-18 Thread Shahab Yunus
, 2014, at 8:06 AM, Shahab Yunus shahab.yu...@gmail.com wrote: You mean if used independently? Yes, they do. Regards, Shahab On Tue, Nov 18, 2014 at 10:51 AM, Ted Yu yuzhih...@gmail.com wrote: Have you verified that at least one of the following (when used alone) returns data

Hierarchy of filters and filters list

2014-11-17 Thread Shahab Yunus
Hi, I have data where each row has a start and an end time stored in UTC (long). The table is created through Phoenix and the columns have type UNSIGNED_DATE (which, according to the Phoenix docs http://phoenix.apache.org/language/datatypes.html#unsigned_date_type, does Hbase.toBytes(long) underneath for 8

Re: Hierarchy of filters and filters list

2014-11-17 Thread Shahab Yunus
Missed a couple of things. 1- I am using SingleColumnValueFilter and the comparator is BinaryComparator, which is passed into it. 2- CDH 5.1.0 (HBase is 0.98.1-cdh5.1.0) Regards, Shahab On Tue, Nov 18, 2014 at 12:22 AM, Shahab Yunus

Forcibly merging regions

2014-11-14 Thread Shahab Yunus
The documentation of the online merge tool (merge_region) states that if we forcibly merge regions (by setting the 3rd attribute to true) then it can create overlapping regions. If this happens, will this render the region or table unusable, or is it just a performance hit? I mean, how big of a

Re: Forcibly merging regions

2014-11-14 Thread Shahab Yunus
a look at master log around the time merge request was issued to see if you can get some clue ? Cheers On Fri, Nov 14, 2014 at 6:41 AM, Shahab Yunus shahab.yu...@gmail.com wrote: The documentation of online merge tool (merge_region) states that if we forcibly merge regions (by setting

Re: Forcibly merging regions

2014-11-14 Thread Shahab Yunus
FYI, Ted, I see this exact similar issue being discussed in the past here as well: http://mail-archives.apache.org/mod_mbox/hbase-user/201406.mbox/%3CCAKrkF=thi8g4Ks=viqgC+Y=ivuqysogoq41rmkutfriunal...@mail.gmail.com%3E Regards, Shahab On Fri, Nov 14, 2014 at 11:35 AM, Shahab Yunus shahab.yu

Re: Forcibly merging regions

2014-11-14 Thread Shahab Yunus
in your master log: LOG.error("Merged region " + region.getRegionNameAsString() + " has only one merge qualifier in META."); It would be the case that 7373f75181c71eb5061a6673cee15931 still had a reference file. Cheers On Fri, Nov 14, 2014 at 8:35 AM, Shahab Yunus shahab.yu

Re: Forcibly merging regions

2014-11-14 Thread Shahab Yunus
Yu yuzhih...@gmail.com wrote: One possibility was that region 7373f75181c71eb5061a6673cee15931 was involved in some hbase snapshot. Was the underlying table being snapshotted in recent past ? Cheers On Fri, Nov 14, 2014 at 9:05 AM, Shahab Yunus shahab.yu...@gmail.com wrote: Thanks again

Re: Forcibly merging regions

2014-11-14 Thread Shahab Yunus
I just checked. No snapshots were taken and 'list_snapshots' also returns nothing. Regards, Shahab On Fri, Nov 14, 2014 at 12:39 PM, Shahab Yunus shahab.yu...@gmail.com wrote: No. Not that I can recall but I can check. From resolution perspective, is there any way we can resolve this. More

Re: Forcibly merging regions

2014-11-14 Thread Shahab Yunus
( this.services.getConfiguration(), fs, tabledir, mergedRegion, true ); ... Then regionFs.hasReferences(htd) would tell you whether the underlying region has reference files. Cheers On Fri, Nov 14, 2014 at 9:39 AM, Shahab Yunus shahab.yu...@gmail.com wrote: No. Not that I can recall

Re: Forcibly merging regions

2014-11-14 Thread Shahab Yunus
Yesterday, I believe. Regards, Shahab On Fri, Nov 14, 2014 at 1:07 PM, Ted Yu yuzhih...@gmail.com wrote: Shahab: When was the last time compaction was run on this table ? Cheers On Fri, Nov 14, 2014 at 9:58 AM, Shahab Yunus shahab.yu...@gmail.com wrote: I see. Thanks

Re: Forcibly merging regions

2014-11-14 Thread Shahab Yunus
/Server.java#Server.getCatalogTracker%28%29(), mergedRegion); return true; Do you think it is OK, if we face this issue, to then forcibly archive and clean the regions? Regards, Shahab On Fri, Nov 14, 2014 at 1:10 PM, Shahab Yunus shahab.yu...@gmail.com wrote: Yesterday, I believe. Regards

Re: Forcibly merging regions

2014-11-14 Thread Shahab Yunus
After major compacting, the references were freed for the above-mentioned regions, and then the merge_region command succeeded and they got merged. Hmmm. Regards, Shahab On Fri, Nov 14, 2014 at 2:08 PM, Shahab Yunus shahab.yu...@gmail.com wrote: Digging deeper into the code, I came across

Re: Forcibly merging regions

2014-11-14 Thread Shahab Yunus
that it is time for major compaction. Cheers On Fri, Nov 14, 2014 at 11:31 AM, Shahab Yunus shahab.yu...@gmail.com wrote: After major compacting the references were freed for the above mentioned regions and then the merge_region command succeeded and they got merged. Hmmm. Regards, Shahab

Re: Forcibly merging regions

2014-11-14 Thread Shahab Yunus
compaction on selected region, see: public void majorCompactRegion(final byte[] regionName) Cheers On Fri, Nov 14, 2014 at 11:49 AM, Shahab Yunus shahab.yu...@gmail.com wrote: I see. Thanks. So we can in a way automate this resolution by invoking major compaction programmatically
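
A hedged sketch of that automation against the 0.98-era HBaseAdmin API (majorCompact accepts a table or full region name, mergeRegions takes encoded region names; both calls are asynchronous, so real code would poll before merging):

```java
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.util.Bytes;

public class CompactThenMerge {
  // fullRegionName*: e.g. "table,startkey,timestamp.<encoded>."
  // encodedName*:    just the trailing hash portion of the region name
  public static void run(HBaseAdmin admin,
      byte[] fullRegionNameA, byte[] fullRegionNameB,
      String encodedNameA, String encodedNameB) throws Exception {
    // Major-compact both regions so leftover reference files get cleaned up.
    admin.majorCompact(fullRegionNameA);
    admin.majorCompact(fullRegionNameB);
    // ... compaction is asynchronous: wait/poll here before merging ...
    // 'false' = do not force a merge of non-adjacent regions.
    admin.mergeRegions(Bytes.toBytes(encodedNameA),
        Bytes.toBytes(encodedNameB), false);
  }
}
```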

Re: range scan based middle of rowkey

2014-11-06 Thread Shahab Yunus
I think you have to make multiple parallel queries and combine the results on the client side. Something like this is done in this implementation: http://blog.sematext.com/2012/04/09/hbasewd-avoid-regionserver-hotspotting-despite-writing-records-with-sequential-keys/ Regards, Shahab On Thu, Nov 6,
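
A minimal sketch of that client-side combining, assuming HBaseWD-style rowkeys with a one-byte bucket prefix; it runs sequentially for brevity, whereas the library linked above scans the buckets in parallel:

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.hbase.client.HTableInterface;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class BucketedRangeScan {
  // Assumes rowkeys of the form <1-byte bucket><original key>. One scan per
  // bucket; results are combined (and merge-sorted, if ordering matters)
  // on the client.
  public static List<Result> scanAllBuckets(HTableInterface table, int buckets,
      byte[] startKey, byte[] stopKey) throws IOException {
    List<Result> combined = new ArrayList<Result>();
    for (int b = 0; b < buckets; b++) {
      byte[] prefix = new byte[] { (byte) b };
      Scan scan = new Scan(Bytes.add(prefix, startKey), Bytes.add(prefix, stopKey));
      ResultScanner scanner = table.getScanner(scan);
      try {
        for (Result r : scanner) {
          combined.add(r);
        }
      } finally {
        scanner.close();
      }
    }
    return combined;
  }
}
```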

Re: range scan based middle of rowkey

2014-11-06 Thread Shahab Yunus
need to re-import all data again? Thanks! On Thursday, November 6, 2014, Shahab Yunus shahab.yu...@gmail.com wrote: I think you have to make parallel multiple queries and combine the result on client side. Something like this is doing in its implementation: http://blog.sematext.com

Re: How can I set the num of mappers when I use hbase RowCounter on Yarn?

2014-10-20 Thread Shahab Yunus
Have you tried setting the following property through the command line? -D mapreduce.job.mappers Regards, Shahab On Mon, Oct 20, 2014 at 2:24 AM, liub...@inspur.com liub...@inspur.com wrote: Hello, I used hbase RowCounter on YARN, but the num of mappers was 1, and the progress was 0%.

Re: Help: RegionTooBusyException: failed to get a lock in 60000 ms

2014-09-05 Thread Shahab Yunus
For monotonically increasing data: can you try to do pre-splitting of the destination table? That can help in avoiding one region getting overloaded at the time of bulk import. Regards, Shahab On Fri, Sep 5, 2014 at 12:14 PM, Jianshi Huang jianshi.hu...@gmail.com wrote: Thanks Ted, I'll try to

Re: state-of-the-art method for merging regions on v0.94

2014-08-28 Thread Shahab Yunus
I have a question here. In 0.98, is the merge_region command, which can be run through the HBase shell, not reliable, if we simply want to merge 2 regions at a time? I thought that it was the older Merge tool that was not safe. Thanks, Shahab On Thu, Aug 28, 2014 at 2:26 PM, Bryan Beaudreault

Re: state-of-the-art method for merging regions on v0.94

2014-08-28 Thread Shahab Yunus
-28 14:33 GMT-04:00 Shahab Yunus shahab.yu...@gmail.com: I have a question here. In 0.98 the merge_region command which can be run through HBase shell is not reliable? If we simply want to merge 2 regions at a time? I thought that the older Merge tool was not safe. Thanks, Shahab

Re: writing to multiple hbase tables in a mapreduce job

2014-08-26 Thread Shahab Yunus
You don't need to initialize the tables. You just need to specify the output format as the MultipleTableOutputFormat class. Something like this: job.setOutputFormatClass(MultipleTableOutputFormat.class); Because if you look at the code for MultipleTableOutputFormat, it creates the table on the fly and
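
A hedged sketch of this setup; note the class shipped in org.apache.hadoop.hbase.mapreduce is spelled MultiTableOutputFormat, and the table, family, and qualifier names below are illustrative:

```java
import java.io.IOException;
import org.apache.hadoop.hbase.client.Mutation;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.MultiTableOutputFormat;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Reducer;

public class RouteByTableReducer
    extends Reducer<Text, Text, ImmutableBytesWritable, Mutation> {

  // Job setup: no per-table initialization, just the output format.
  public static void configure(Job job) {
    job.setOutputFormatClass(MultiTableOutputFormat.class);
  }

  @Override
  protected void reduce(Text key, Iterable<Text> values, Context context)
      throws IOException, InterruptedException {
    for (Text value : values) {
      Put put = new Put(Bytes.toBytes(key.toString()));
      put.add(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes(value.toString()));
      // The output key names the destination table for each Mutation.
      context.write(new ImmutableBytesWritable(Bytes.toBytes("table1")), put);
    }
  }
}
```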

Re: writing to multiple hbase tables in a mapreduce job

2014-08-26 Thread Shahab Yunus
); boolean b = job.waitForCompletion(true); if (!b) { throw new IOException("error with job!"); } I am unable to figure out what I am missing here, -yeshwanth On Wed, Aug 27, 2014 at 12:23 AM, Shahab Yunus shahab.yu...@gmail.com wrote: You don't need to initialize the tables

Re: writing to multiple hbase tables in a mapreduce job

2014-08-26 Thread Shahab Yunus
: that mapreduce job reads data from hbase table, it doesn't take any explicit input data/file/ -yeshwanth On Wed, Aug 27, 2014 at 12:44 AM, Shahab Yunus shahab.yu...@gmail.com wrote: Where are you setting the input data/path/format for the job? I don't see that in the code below

Re: writing to multiple hbase tables in a mapreduce job

2014-08-26 Thread Shahab Yunus
, job); //otherArgs[0]=i1 TableMapReduceUtil.initTableReducerJob(otherArgs[0], null, job); Ted suggested to remove them; if you see the first message in this thread, you will know the issue caused by specifying the table. -yeshwanth On Wed, Aug 27, 2014 at 12:54 AM, Shahab Yunus shahab.yu...@gmail.com

Re: Splitting an existing table with new keys.

2014-08-20 Thread Shahab Yunus
19, 2014 at 7:00 PM, Ted Yu yuzhih...@gmail.com wrote: My suggestion wasn't about pre-splitting. You can insert dummy values as part of your proof-of-concept code - before admin.split() is called. On Tue, Aug 19, 2014 at 3:50 PM, Shahab Yunus shahab.yu...@gmail.com wrote: Ted, Hmmm

Splitting an existing table with new keys.

2014-08-19 Thread Shahab Yunus
I have a table already created and with some data. I want to split it through code using the HBaseAdmin API into multiple regions, while specifying keys that do not exist in the table. I am getting the exception below, which makes sense because the key doesn't exist yet. But at the time of creation of

Re: Splitting an existing table with new keys.

2014-08-19 Thread Shahab Yunus
...@spaggiari.org wrote: Hi Shahab, can you share your code? It seems that the RS you reached did not have the expected region. How is your table status in the web interface? JM 2014-08-19 16:11 GMT-04:00 Shahab Yunus shahab.yu...@gmail.com: I have a table already created and with some data

Re: Splitting an existing table with new keys.

2014-08-19 Thread Shahab Yunus
whose keys correspond to the splits ? Cheers On Tue, Aug 19, 2014 at 1:29 PM, Shahab Yunus shahab.yu...@gmail.com wrote: So the situation here is that we are trying to bulk load data into a table. But each load of data has such a range of keys that it will go to a specific continuous chunk

Relationship between number of reducers and number of regions in the table

2014-08-14 Thread Shahab Yunus
I couldn't decide whether it is an HBase question or a Hadoop/YARN one. In the utility class for MR jobs integrated with HBase, org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil, in the method: public static void initTableReducerJob(String table, Class<? extends TableReducer>

Re: How to get specific rowkey from hbase

2014-08-11 Thread Shahab Yunus
You can use the util classes provided already. Note that it won't be very fast and you might want to try out bulk import as well (especially if it is a one-time or rare occurrence). It depends on your use case. Check out the documentation below: For the Map Reduce HBase util:

Re: Hbase Mapreduce API - Reduce to a file is not working properly.

2014-08-02 Thread Shahab Yunus
Parkirat, This is a core Java concept which is mainly related to how class inheritance works in Java and how the @Override annotation is used, and is not Hadoop-specific. (It is also used while implementing interfaces since JDK 6.) You can read about it here:

Re: Hbase Mapreduce API - Reduce to a file is not working properly.

2014-08-01 Thread Shahab Yunus
Add the @Override annotation at the top of the 'reduce' method and then try (just like you are doing for the 'map' method): public class WordCountReducer extends Reducer<Text, IntWritable, Text, IntWritable> { @Override protected void reduce(Text key, Iterable<IntWritable> values,
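
For reference, the completed reducer with the annotation in place; the body here is the standard word-count sum, filled in for completeness. Without @Override, a mismatched signature silently overloads instead of overriding, so the default identity reduce runs:

```java
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class WordCountReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
  // @Override makes the compiler verify this actually overrides
  // Reducer.reduce rather than adding an unused overload.
  @Override
  protected void reduce(Text key, Iterable<IntWritable> values, Context context)
      throws IOException, InterruptedException {
    int sum = 0;
    for (IntWritable v : values) {
      sum += v.get();
    }
    context.write(key, new IntWritable(sum));
  }
}
```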

Re: Hbase MR Job with 2 OutputForm classes possible?

2014-07-30 Thread Shahab Yunus
There is a trick. You can use MultipleOutputs with TableMapReduceUtil. In the Reducer you can write to the desired outputs on HDFS using MultipleOutputs, and the HBase util will do its work as is. The only caveat is that you will have to commit the files that you have written using MultipleOutputs

Re: how can I use sqoop transfer data from mysql to hbase

2014-07-22 Thread Shahab Yunus
Can you explain a bit what issue you are facing? Sqoop's documentation explains quite clearly how to import data from MySQL to HBase. You can use those commands in a script to automate the process. http://sqoop.apache.org/docs/1.4.3/SqoopUserGuide.html#_selecting_the_data_to_import Regards,

Re: when to use hive vs hbase

2014-04-30 Thread Shahab Yunus
Hive and HBase are 2 different tools/technologies. They are used together but they are not interchangeable. Hive is for on-demand, RDBMS/SQL-like data access, while HBase is the actual data store. Hive runs on HBase providing an on-demand, SQL-like API. Regards, Shahab On Wed, Apr 30, 2014 at 4:34

Re: test compression in hbase

2014-03-25 Thread Shahab Yunus
It says: RemoteException(java.io.IOException): /hbase/test is non empty. Is the directory empty or are there files from some previous runs? Does the user have access to delete the data here? Regards, Shahab On Tue, Mar 25, 2014 at 7:42 AM, Mohamed Ghareb m.ghar...@tedata.net wrote: How I can

Re: ExportSnapshot using webhdfs

2014-03-21 Thread Shahab Yunus
Also Matteo, just like distcp, one advantage of this (using webhdfs while copying) could also be that even if the versions are not the same, we can still copy? Regards, Shahab On Fri, Mar 21, 2014 at 8:14 AM, Matteo Bertozzi theo.berto...@gmail.com wrote: ExportSnapshot uses the FileSystem API so

Re: LSM tree, SSTable and fractal tree

2014-02-28 Thread Shahab Yunus
http://www.slideshare.net/jaxlondon2012/hbase-advanced-lars-george http://hortonworks.com/hadoop/hbase/ Regards, Shahab On Fri, Feb 28, 2014 at 8:36 AM, Vimal Jain vkj...@gmail.com wrote: Hi, Which one of the storage structure does hbase uses ? Is it LSM tree , SSTable or fractal tree ?

Re: question about Hmaster web console

2014-02-07 Thread Shahab Yunus
Also adding to what Ted mentioned, the following book has more details:

Re: copy data inter-cluster with different version of Hadoop

2013-10-28 Thread Shahab Yunus
Take a look at using the webHDFS protocol to use distcp between clusters with different versions: On Mon, Oct 28, 2013 at 3:14 PM, S. Zhou myx...@yahoo.com wrote: I need to copy data from Hadoop cluster A to cluster B. I know I can use the distCp tool to do that. Now the problem is: cluster A has

Re: copy data inter-cluster with different version of Hadoop

2013-10-28 Thread Shahab Yunus
Sorry, last email was accidentally sent before I can finish it. Take a look at using webHDFS protocol to use distcp between clusters with different versions:

Re: Write TimeSeries Data and Do Time Based Range Scans

2013-09-24 Thread Shahab Yunus
I only know of the links already embedded in the blog page that I sent you, or there is this: https://groups.google.com/forum/#!forum/hbasewd Regards, Shahab On Tue, Sep 24, 2013 at 11:12 AM, anil gupta anilgupt...@gmail.com wrote: Inline On Mon, Sep 23, 2013 at 6:15 PM, Shahab Yunus

Re: Loading hbase-site.xml settings from Hadoop MR job

2013-09-23 Thread Shahab Yunus
From where are you running your job? From which machine? The client machine from where you are kicking off this job should have the hbase-site.xml with the correct ZK info in it. It seems that your client/job is having an issue picking up the right ZK, rather than the services running on your

Re: Write TimeSeries Data and Do Time Based Range Scans

2013-09-23 Thread Shahab Yunus
http://blog.sematext.com/2012/04/09/hbasewd-avoid-regionserver-hotspotting-despite-writing-records-with-sequential-keys/ Here you can find the discussion, trade-offs and working code/API (even for M/R) about this and the approach you are trying out. Regards, Shahab On Mon, Sep 23, 2013 at 5:41

Re: Write TimeSeries Data and Do Time Based Range Scans

2013-09-23 Thread Shahab Yunus
:51 PM, Shahab Yunus shahab.yu...@gmail.com wrote: http://blog.sematext.com/2012/04/09/hbasewd-avoid-regionserver-hotspotting-despite-writing-records-with-sequential-keys/ Here you can find the discussion, trade-offs and working code/API (even for M/R) about this and the approach you

Re: hdfs data into Hbase

2013-09-09 Thread Shahab Yunus
Some quick thoughts: well, your size is bound to increase because, recall, the rowkey is stored in every cell. So if in CSV you have, let us say, 5 columns and you import them to HBase using the first column as the key, then you will end up with essentially 9 (1 for the rowkey and then 2

Re: HBase MR - key/value mismatch

2013-09-06 Thread Shahab Yunus
context) So the type parameters above should facilitate this. Take a look at the PutCombiner from the HBase source code: public class PutCombiner<K> extends Reducer<K, Put, K, Put> { Cheers On Thu, Sep 5, 2013 at 9:46 AM, Shahab Yunus shahab.yu...@gmail.com wrote: Ted, Might be something

Re: HBase MR - key/value mismatch

2013-09-05 Thread Shahab Yunus
Try using Bytes.toBytes(your string) rather than String.getBytes(). Regards, Shahab On Thu, Sep 5, 2013 at 2:16 AM, Omkar Joshi omkar.jo...@lntinfotech.com wrote: I'm trying to execute MR code over stand-alone HBase (0.94.11). I had read the HBase API and modified my MR code to read data and

Re: user action modeling

2013-09-05 Thread Shahab Yunus
Your read queries seem to be more driven from the 'action' and 'object' perspective, rather than the user. 1- So one option is that you make a composite key with action and object: action|object, and the columns are users who are generating events on this combination. You can scan using a prefix filter
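
A sketch of option 1, assuming 'action|object' rowkeys; the prefix both seeds the start row and bounds the scan:

```java
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.PrefixFilter;
import org.apache.hadoop.hbase.util.Bytes;

public class ActionObjectScans {
  // Rowkey layout assumed: "<action>|<object>", with users as columns.
  public static Scan byAction(String action) {
    byte[] prefix = Bytes.toBytes(action + "|");
    Scan scan = new Scan(prefix);             // start at the first matching row
    scan.setFilter(new PrefixFilter(prefix)); // stop matching beyond the prefix
    return scan;
  }
}
```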

Re: Programming practices for implementing composite row keys

2013-09-05 Thread Shahab Yunus
AM, Shahab Yunus shahab.yu...@gmail.com wrote: My 2 cents: 1- Yes, that is one way to do it. You can also use fixed length for every attribute participating in the composite key. HBase scan would be more fitting to this pattern as well, I believe (?) It's a trade-off basically between

Re: HBase MR - key/value mismatch

2013-09-05 Thread Shahab Yunus
Ted, Might be something very basic that I am missing, but why should the OP's reducer's key be of type ImmutableBytesWritable if he is emitting Text in the mapper? Thanks. protected void map( ImmutableBytesWritable key, Result value,

Re: HBase - stable versions

2013-09-04 Thread Shahab Yunus
This may be a newbie or dumb question, but I believe this does not affect or apply to HBase distributions by other vendors like Hortonworks or Cloudera. If someone is using one of the versions of distributions provided by them, then it is up to them (and not the people and community here) what and till

Re: Hbase RowKey design schema

2013-08-29 Thread Shahab Yunus
What advantage will you gain by compressing? Less space? But then it will add compression/decompression performance overhead. A trade-off, but not an especially significant one, as space is cheap and redundancy is OK with such data stores. Having said that, more importantly, what are your read

Re: java.io.IOException: Type mismatch in value from map: expected org.apache.hadoop.hbase.client.Put, recieved org.apache.hadoop.io.BytesWritable

2013-08-29 Thread Shahab Yunus
the keys as NullWritable or LongWritable, i.e. by keeping the same types of keys, I am getting the same error. I don't think the error is on the Map input side. It's saying 'value from map'. I can't understand where I am going wrong. Regards Praveenesh On Thu, Aug 29, 2013 at 4:58 PM, Shahab Yunus

Re: java.io.IOException: Type mismatch in value from map: expected org.apache.hadoop.hbase.client.Put, recieved org.apache.hadoop.io.BytesWritable

2013-08-29 Thread Shahab Yunus
Exactly, I had the same thought as Ashwanth; that is why I asked whether the @Override annotation is being used or not. Regards, Shahab On Thu, Aug 29, 2013 at 1:09 PM, Ashwanth Kumar ashwanthku...@googlemail.com wrote: Hey Praveenesh, I am not sure if this would help. But can you try moving

Re: how to export data from hbase to mysql?

2013-08-28 Thread Shahab Yunus
Taking what Ravi Kiran mentioned a level higher, you can also use Pig. It has DBStorage. Very easy to read from HBase and dump to MySQL if your data porting does not require complex transformation (which can even be handled in Pig too.)

Re: one column family but lots of tables

2013-08-22 Thread Shahab Yunus
"Do I understand it correctly that when I create lots of tables, but they all use the same column family (by name), I am just using one column family and am OK with respect to limiting the number of column families?" I don't think so. Column families are per table. Even if the name of the

Re: Java Null Pointer Exception!

2013-08-19 Thread Shahab Yunus
Can you please explain or show the flow of the code a bit more? Why are you creating the HTable object again and again in the mapper? Where is ContentidxTable (the name of the table, I believe?) defined? What is your actual requirement? Also, have you looked into this, the api for wiring HBase

Re: Java Null Pointer Exception!

2013-08-19 Thread Shahab Yunus
, Aug 19, 2013 at 7:05 PM, Shahab Yunus shahab.yu...@gmail.com wrote: Can you please explain or show the flow of the code a bit more? Why are you create the HTable object again and again in the mapper? Where is ContentidxTable (the name of the table, I believe?) defined? What is your

Re: How to push modified hbase-site.xml to RS using Cloudera Manager?

2013-08-08 Thread Shahab Yunus
As for your second question, I think the hbase-site.xml file with default options is placed in the hbase jar files/libs on the RS. Someone can correct me if I am wrong. Regards, Shahab On Thu, Aug 8, 2013 at 6:30 PM, Kim Chew kchew...@gmail.com wrote: Hello there, As titled. Also I would

Re: ETL like merge databases to HBase

2013-08-01 Thread Shahab Yunus
Though it is better, as Ted suggested, to discuss this on the Sqoop mailing list (as Sqoop 2 is supposed to be more feature-rich), just to get this out: Sqoop does support incremental imports if you can come up with a suitable and compatible strategy. That might help you if you configure your imports on

Re: How to get row key from row value?

2013-07-31 Thread Shahab Yunus
Please correct me if I am wrong, but I think there is no hard and fast technique for it as such. There are no constructs or methods for this specifically in HBase. Your client, while writing, has to make sure to write to both tables: 1) the main table 2) and the secondary index table. Basically it is
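
A minimal sketch of that dual write (family and qualifier names are illustrative); note the two puts are not atomic across tables, which is exactly the burden the client has to carry:

```java
import java.io.IOException;
import org.apache.hadoop.hbase.client.HTableInterface;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class DualWriteIndexer {
  // Writes the row to the main table and a value->rowkey entry to the
  // index table. A failure between the two puts can leave the index
  // out of sync; there is no cross-table transaction here.
  public static void put(HTableInterface main, HTableInterface index,
      byte[] rowKey, byte[] value) throws IOException {
    Put mainPut = new Put(rowKey);
    mainPut.add(Bytes.toBytes("cf"), Bytes.toBytes("v"), value);
    main.put(mainPut);

    Put indexPut = new Put(value); // the index rowkey is the value itself
    indexPut.add(Bytes.toBytes("cf"), Bytes.toBytes("key"), rowKey);
    index.put(indexPut);
  }
}
```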

Re: How to get row key from row value?

2013-07-31 Thread Shahab Yunus
is more of a temporary table which I can just do a get operation and get the value? Am I right? Regards, Pavan On Jul 31, 2013 7:24 PM, Shahab Yunus shahab.yu...@gmail.com wrote: Please correct me if I am wrong but I think there is as such no hard and fast technique

Re: How to join 2 tables using hadoop?

2013-07-20 Thread Shahab Yunus
wrote: If I have the value of a row in JSON format, would Pig be able to parse it and join the fields as per my needs? On Fri, Jul 19, 2013 at 10:00 PM, Shahab Yunus shahab.yu...@gmail.com wrote: You can also look into Pig, if you already haven't. It supports various kinds

Re: How to join 2 tables using hadoop?

2013-07-19 Thread Shahab Yunus
You can also look into Pig, if you already haven't. It supports various kinds of joins and is simpler than writing your own M/R job (assuming that you don't have complex or custom requirements.) Regards, Shahab On Fri, Jul 19, 2013 at 12:24 AM, Pavan Sudheendra pavan0...@gmail.com wrote: Hi,

Re: Using separator/delimiter in HBase rowkey?

2013-07-08 Thread Shahab Yunus
Not saying this is a solution or better in any way, but just more food for thought. Is there any maximum size limit for UserIds? You can also pad User IDs of smaller length. You are using more space this way, though. It can help in sorting as well. Regards, Shahab On Mon, Jul 8, 2013 at

Re: Adding a new region server or splitting an old region in a Hash-partitioned HBase Data Store

2013-06-27 Thread Shahab Yunus
I think you will need to update your hash function and redistribute the data. As far as I know, this has been one of the drawbacks of this approach (and the Sematext library). Regards, Shahab On Wed, Jun 26, 2013 at 7:24 PM, Joarder KAMAL joard...@gmail.com wrote: May be a simple question to answer

Re: Adding a new region server or splitting an old region in a Hash-partitioned HBase Data Store

2013-06-27 Thread Shahab Yunus
to guide me through any reference which can confirm this understanding? Regards, Joarder Kamal On 27 June 2013 23:24, Shahab Yunus shahab.yu...@gmail.com wrote: I think you will need to update your hash function and redistribute data. As far as I know this has been on of the drawbacks

Re: How to specify the hbase.zookeeper.quorum on command line invoking hbase shell

2013-06-24 Thread Shahab Yunus
Have you tried creating your own small script in which you set the relevant environment variables per session (using 'export' for example)? On Mon, Jun 24, 2013 at 1:33 AM, Stephen Boesch java...@gmail.com wrote: We want to connect to a non-default / remote hbase server by setting

Re: How to specify the hbase.zookeeper.quorum on command line invoking hbase shell

2013-06-24 Thread Shahab Yunus
'hbase' does not seem to have a --config/-config parameter. Regards, Shahab On Mon, Jun 24, 2013 at 8:39 AM, rajeshbabu chintaguntla rajeshbabu.chintagun...@huawei.com wrote: Can you try copying hbase-site.xml to other folder and change hbase.zookeeper.quorum to remote server and then use
