One thought to ponder:
If you are going to be splitting continuously and at a quicker pace, do you
have a strategy/plan to merge old regions? Otherwise, you can end up with a
cluster with a proliferation of regions.
Regards,
Shahab
On Tue, Aug 18, 2015 at 3:55 PM, Shushant Arora
is to build something on top of FilterList?
-----Original Message-----
From: Shahab Yunus [mailto:shahab.yu...@gmail.com]
Sent: Monday, August 17, 2015 16:17
To: user@hbase.apache.org
Subject: Re: Hbase compound filter support
You can build nested Filter conditions using FilterLists of FilterLists.
FilterList implements the Filter interface, so the 'add' method will work.
Have you checked that approach?
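Something like this, as a minimal sketch (family/qualifier names and values
are made up):

  FilterList inner = new FilterList(FilterList.Operator.MUST_PASS_ONE);
  inner.addFilter(new SingleColumnValueFilter(Bytes.toBytes("cf"),
      Bytes.toBytes("colA"), CompareFilter.CompareOp.EQUAL, Bytes.toBytes("x")));
  inner.addFilter(new SingleColumnValueFilter(Bytes.toBytes("cf"),
      Bytes.toBytes("colB"), CompareFilter.CompareOp.EQUAL, Bytes.toBytes("y")));
  // outer AND; addFilter(inner) works because FilterList is itself a Filter
  FilterList outer = new FilterList(FilterList.Operator.MUST_PASS_ALL);
  outer.addFilter(new PrefixFilter(Bytes.toBytes("someRowPrefix")));
  outer.addFilter(inner);
  Scan scan = new Scan();
  scan.setFilter(outer);

That gives you (rowkey starts with someRowPrefix) AND (colA = x OR colB = y).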
Regards,
Shahab
On Mon, Aug 17, 2015 at 4:07 PM, kannan.ramanat...@barclays.com wrote:
Hello,
Is there a Java API
My understanding is (please feel free to correct if I am wrong.):
For your first question, I think, more than efficiency, TableOutputFormat
provides you the convenience of an output format out of the
box which can do the Puts for you with default recommended config settings
like flush,
Ivan,
Is it possible that you poll for the number of regions (e.g. in a loop) after
invoking split or merge to confirm that the action has been performed? I
know it is a crude way but maybe something can be done along these lines. Or
are you already doing this when you said 'look for Region Info'?
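A rough sketch of that polling idea (the table name and timeout are made up,
and I am assuming HBaseAdmin.getTableRegions here):

  HBaseAdmin admin = new HBaseAdmin(conf);
  byte[] table = Bytes.toBytes("mytable");
  int before = admin.getTableRegions(table).size();
  admin.split(table); // async; returns before the split actually happens
  long deadline = System.currentTimeMillis() + 60000L;
  while (System.currentTimeMillis() < deadline) {
    if (admin.getTableRegions(table).size() > before) {
      break; // region count went up, so the split has been performed
    }
    Thread.sleep(1000);
  }

Crude, as said, but it confirms the action without inspecting Region Info by hand.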
I am very new here and my contribution to the mailing list has been
limited as well. I am not even a committer. But I have been following and
reading the mailing list for a while. So given that, I am taking the
liberty of chiming in with my 2 cents. I don't profess or claim to read other
The error is highlighting the issue.
You can't output a List of Puts like this. Your reducer output is Mutation
and NOT a list of Mutations.
I have handled this scenario by defining my own base abstract class:
public abstract class TableReducerBatchPuts<KEYIN, VALUEIN, KEYOUT>
extends
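(A hypothetical completion of that class, since the snippet is cut off; the
body below is my guess at the intent, not the original code:)

  import java.io.IOException;
  import java.util.List;
  import org.apache.hadoop.hbase.client.Put;
  import org.apache.hadoop.hbase.mapreduce.TableReducer;

  public abstract class TableReducerBatchPuts<KEYIN, VALUEIN, KEYOUT>
      extends TableReducer<KEYIN, VALUEIN, KEYOUT> {
    // Emit each Put of a batch individually, because the reducer's
    // output value type must be a single Mutation, not a List of them.
    protected void writeBatch(KEYOUT key, List<Put> puts, Context context)
        throws IOException, InterruptedException {
      for (Put put : puts) {
        context.write(key, put); // one Mutation at a time
      }
    }
  }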
of
TableMapReduceUtil.initTableReducerJob(inputTableName, Reduce.class,
job);
In fact, it is now a Reduce class, not a TableReducer, but I have the same error
Error: java.lang.ClassCastException: java.util.ArrayList cannot be cast to
org.apache.hadoop.hbase.client.Mutation
2015-05-19 15:14 GMT+02:00 Shahab Yunus
Lots of options depending upon the specifics of your use case:
In addition to Hive...
You can use Sqoop
http://www.dummies.com/how-to/content/importing-data-into-hbase-with-sqoop.html
You can use Pig
Until you move to HBase 1.*, you should use HTableInterface. And the
autoFlush methods and semantics are, as far as I understand, the same, so
you should not have a problem.
Regards,
Shahab
On Wed, May 13, 2015 at 11:09 AM, Serega Sheypak serega.shey...@gmail.com
wrote:
But HTable is deprecated in
In addition to Talat's comment about more info, you can check out the
following properties:
phoenix.query.queueSize
phoenix.query.timeoutMs
http://phoenix.apache.org/tuning.html
We have set these in the hbase-site.xml on the client machine where
Squirrel was running in the cases where the
You can specify the column family or column to read when you create the
Scan object. Have you tried that? Does it make sense? Or did I misunderstand
your problem?
http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html#addColumn(byte[],%20byte[])
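For example (family/qualifier names are placeholders):

  Scan scan = new Scan();
  scan.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("col1")); // one column only
  // or scan.addFamily(Bytes.toBytes("cf")); for the whole family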
I see that you are already using the partial key scan approach.
When you say that you want data from timeStamp2 as well, do you want all
the rows from timeStamp2? Also, how are you actually setting the stopRow?
What are you providing for the anyNumber field when you are setting
stopRow?
e.g. if
For the null key you should use NullWritable class, as discussed here:
http://stackoverflow.com/questions/16198752/advantages-of-using-nullwritable-in-hadoop
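E.g. in the mapper (the value type is assumed):

  context.write(NullWritable.get(), value); // NullWritable.get() is a
                                            // singleton; serializes to nothing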
Regards,
Shahab
On Mon, Apr 13, 2015 at 7:01 AM, Jean-Marc Spaggiari
jean-m...@spaggiari.org wrote:
Hi Silvio,
What is the key you
...
JM
2015-04-13 7:46 GMT-04:00 Shahab Yunus shahab.yu...@gmail.com:
For the null key you should use NullWritable class, as discussed here:
http://stackoverflow.com/questions/16198752/advantages-of-using-nullwritable-in-hadoop
Regards,
Shahab
On Mon, Apr 13, 2015 at 7
If you know the row key range of your data, then you can create split
points yourself and then use the HBase API to actually make the splits.
E.g. if you know that your row key (and it is a very contrived example) has
a range of A - Z, then you can decide on split points at every 5th letter
as your
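A sketch of that contrived A-Z example (the table/family names and the use of
createTable with split points are my assumptions):

  byte[][] splitPoints = new byte[][] {
    Bytes.toBytes("E"), Bytes.toBytes("J"), Bytes.toBytes("O"),
    Bytes.toBytes("T"), Bytes.toBytes("Y")
  };
  HTableDescriptor desc = new HTableDescriptor(TableName.valueOf("mytable"));
  desc.addFamily(new HColumnDescriptor("cf"));
  admin.createTable(desc, splitPoints);
  // 6 regions: [,E) [E,J) [J,O) [O,T) [T,Y) [Y,)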
Congrats and thanks to everyone involved. A big milestone! HBase 1.0
Regards
Shahab
On Tue, Feb 24, 2015 at 2:24 PM, anil gupta anilgupt...@gmail.com wrote:
Kudos to HBase Team.
Read HA feature sounds exciting.
~Anil
On Tue, Feb 24, 2015 at 10:37 AM, Rajeshbabu Chintaguntla
many tables are there in your cluster ?
Is the cluster balanced overall (in terms of number of regions per server)
but this table is not ?
What happens (check master log) when you issue 'balancer' command through
shell ?
Cheers
On Fri, Feb 13, 2015 at 8:19 AM, Shahab Yunus shahab.yu
CDH 5.3
HBase 0.98.6
We are writing data to an HBase table through a M/R job. We pre-split the
table before each job run. The problem is that most of the regions end up
on the same RS. This results in one RS being severely overloaded and
subsequent M/R jobs failing trying to write to the
:
bq. all the regions of this table were back on this same RS!
Interesting. Please check master log around the time this RS was brought
online. You can pastebin the relevant snippet.
Thanks
On Fri, Feb 13, 2015 at 8:55 AM, Shahab Yunus shahab.yu...@gmail.com
wrote:
Hi Ted.
Yes
.
See if raising to 100 or 200 helps.
On Fri, Feb 13, 2015 at 1:09 PM, Shahab Yunus shahab.yu...@gmail.com
wrote:
Yes, this server hosts other regions from other tables as well.
Regards
Shahab
On Fri, Feb 13, 2015 at 1:45 PM, Ted Yu yuzhih...@gmail.com wrote:
Interesting, does server7,60020,1423845018628 host regions from another table?
Cheers
On Fri, Feb 13, 2015 at 10:27 AM, Shahab Yunus shahab.yu...@gmail.com
wrote:
Table name is:
MYTABLE_RECENT_4W_V2
Pastebin snippet 1: http://pastebin.com/dQzMhGyP
Pastebin snippet 2: http://pastebin.com/Y7ZsNAgF
This is the master log
If we programmatically split a table by the async method in the HBaseAdmin
class, and even after waiting for quite a while the split does not happen (there
is no difference in the number of regions before and after the split call) and
there is no error or an exception either, does it mean that there is
.
St.Ack
On Wed, Feb 4, 2015 at 12:04 PM, Shahab Yunus shahab.yu...@gmail.com
wrote:
If we programmatically split a table by the async method in the HBaseAdmin
class, and even after waiting for quite a while, the split does not happen
(there
is no difference in number of regions before and after
9 Lakh = 900,000
Regards,
Shahab
On Fri, Jan 23, 2015 at 11:44 AM, Ted Yu yuzhih...@gmail.com wrote:
Can you pastebin master / region server log around the time table was
disabled ?
BTW can you rephrase '9 lakh records' ? I don't know how many records that
is.
Cheers
On Thu, Jan 22,
You need to pre-split the table into regions, as in this tool the number of
reducers is driven by the number of regions in the target table.
Read about it here:
http://hbase.apache.org/book/perf.writing.html
http://hortonworks.com/blog/apache-hbase-region-splitting-and-merging/
Regards,
Shahab
Yes you can. In fact, in some of the vendors' distributions it comes with
the standard installation.
You can also use Hive, and the more elaborate, powerful but complex Phoenix.
Regards,
Shahab
On Mon, Dec 1, 2014 at 6:15 PM, Jignesh Patel jigneshmpa...@gmail.com
wrote:
can we use Apache
I think your best bet, to get the latest and most accurate data possible,
would be to directly contact the companies (through their Engineering
channels) which are known to host large clusters. Most of these companies
have public blogs and such, so it should not be hard to find an appropriate
contact.
, Shahab Yunus shahab.yu...@gmail.com
wrote:
Missed a couple of things.
1- I am using SingleColumnValueFilter and the comparator
is BinaryComparator which is passed into it.
2- CDH 5.1.0
(Hbase is 0.98.1-cdh5.1.0)
Regards,
Shahab
On Tue, Nov 18, 2014 at 12:22 AM, Shahab Yunus
, 2014, at 8:06 AM, Shahab Yunus shahab.yu...@gmail.com wrote:
You mean if used independently? Yes, they do.
Regards,
Shahab
On Tue, Nov 18, 2014 at 10:51 AM, Ted Yu yuzhih...@gmail.com wrote:
Have you verified that at least one of the following (when used alone)
returns data
Hi,
I have data where each row has start and end time stored in UTC (long). The
table is created through Phoenix and the columns have type UNSIGNED_DATE
(which according to Phoenix docs
http://phoenix.apache.org/language/datatypes.html#unsigned_date_type does
Bytes.toBytes(long) underneath for 8
Missed a couple of things.
1- I am using SingleColumnValueFilter and the comparator
is BinaryComparator which is passed into it.
2- CDH 5.1.0
(Hbase is 0.98.1-cdh5.1.0)
Regards,
Shahab
On Tue, Nov 18, 2014 at 12:22 AM, Shahab Yunus shahab.yu...@gmail.com
wrote:
Hi,
I have data where each row
The documentation of the online merge tool (merge_region) states that if we
forcibly merge regions (by setting the 3rd attribute as true) then it can
create overlapping regions. If this happens, will it render the
region or table unusable, or is it just a performance hit? I mean, how big
of a
a look at master log around the time merge request was issued
to see if you can get some clue ?
Cheers
On Fri, Nov 14, 2014 at 6:41 AM, Shahab Yunus shahab.yu...@gmail.com
wrote:
The documentation of the online merge tool (merge_region) states that if we
forcibly merge regions (by setting
FYI, Ted, I see this exact issue being discussed in the past here
as well:
http://mail-archives.apache.org/mod_mbox/hbase-user/201406.mbox/%3CCAKrkF=thi8g4Ks=viqgC+Y=ivuqysogoq41rmkutfriunal...@mail.gmail.com%3E
Regards,
Shahab
On Fri, Nov 14, 2014 at 11:35 AM, Shahab Yunus shahab.yu
in your master log:
LOG.error("Merged region " + region.getRegionNameAsString()
+ " has only one merge qualifier in META.");
It could be the case that 7373f75181c71eb5061a6673cee15931 still had a
reference file.
Cheers
On Fri, Nov 14, 2014 at 8:35 AM, Shahab Yunus shahab.yu
Yu yuzhih...@gmail.com wrote:
One possibility was that region 7373f75181c71eb5061a6673cee15931 was
involved in some hbase snapshot.
Was the underlying table being snapshotted in recent past ?
Cheers
On Fri, Nov 14, 2014 at 9:05 AM, Shahab Yunus shahab.yu...@gmail.com
wrote:
Thanks again
I just checked. No snapshots were taken and 'list_snapshots' also returns
nothing.
Regards,
Shahab
On Fri, Nov 14, 2014 at 12:39 PM, Shahab Yunus shahab.yu...@gmail.com
wrote:
No. Not that I can recall but I can check.
From a resolution perspective, is there any way we can resolve this? More
(this.services.getConfiguration(), fs, tabledir, mergedRegion, true);
...
Then regionFs.hasReferences(htd) would tell you whether the underlying
region has reference files.
Cheers
On Fri, Nov 14, 2014 at 9:39 AM, Shahab Yunus shahab.yu...@gmail.com
wrote:
No. Not that I can recall
Yesterday, I believe.
Regards,
Shahab
On Fri, Nov 14, 2014 at 1:07 PM, Ted Yu yuzhih...@gmail.com wrote:
Shahab:
When was the last time compaction was run on this table ?
Cheers
On Fri, Nov 14, 2014 at 9:58 AM, Shahab Yunus shahab.yu...@gmail.com
wrote:
I see. Thanks
getCatalogTracker(), mergedRegion);
return true;
Do you think it is OK, if we face this issue, that we then forcibly archive
and clean the regions?
Regards,
Shahab
On Fri, Nov 14, 2014 at 1:10 PM, Shahab Yunus shahab.yu...@gmail.com
wrote:
Yesterday, I believe.
Regards
After major compacting, the references were freed for the above-mentioned
regions, and then the merge_region command succeeded and they got merged.
Hmmm.
Regards,
Shahab
On Fri, Nov 14, 2014 at 2:08 PM, Shahab Yunus shahab.yu...@gmail.com
wrote:
Digging deeper into the code, I came across
that it is time
for major compaction.
Cheers
On Fri, Nov 14, 2014 at 11:31 AM, Shahab Yunus shahab.yu...@gmail.com
wrote:
After major compacting, the references were freed for the above-mentioned
regions, and then the merge_region command succeeded and they got merged.
Hmmm.
Regards,
Shahab
compaction on a selected region, see:
public void majorCompactRegion(final byte[] regionName)
Cheers
On Fri, Nov 14, 2014 at 11:49 AM, Shahab Yunus shahab.yu...@gmail.com
wrote:
I see. Thanks.
So we can in a way automate this resolution by invoking major compaction
programmatically
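Something like this, as a rough sketch (the region name bytes are assumed,
and I am leaving out the wait-for-compaction details):

  HBaseAdmin admin = new HBaseAdmin(conf);
  admin.majorCompact(regionName); // async request; regionName is a byte[]
  // ... poll until the compaction finishes (e.g. getCompactionState) ...
  admin.mergeRegions(encodedRegionA, encodedRegionB, false); // retry the merge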
I think you have to make parallel multiple queries and combine the results
on the client side. Something like this is done in its implementation:
http://blog.sematext.com/2012/04/09/hbasewd-avoid-regionserver-hotspotting-despite-writing-records-with-sequential-keys/
Regards,
Shahab
On Thu, Nov 6,
need to re-import
all data again?
Thanks!
On Thursday, November 6, 2014, Shahab Yunus shahab.yu...@gmail.com
wrote:
I think you have to make parallel multiple queries and combine the results
on the client side. Something like this is done in its implementation:
http://blog.sematext.com
Have you tried setting the following property through the command line?
-D mapreduce.job.mappers
Regards,
Shahab
On Mon, Oct 20, 2014 at 2:24 AM, liub...@inspur.com liub...@inspur.com
wrote:
Hello,
I used HBase RowCounter on YARN, but the number of mappers was 1, and the
progress was 0%.
For monotonically increasing data: can you try to do pre-splitting of the
destination table? That can help in avoiding one region getting overloaded
at the time of bulk import.
Regards,
Shahab
On Fri, Sep 5, 2014 at 12:14 PM, Jianshi Huang jianshi.hu...@gmail.com
wrote:
Thanks Ted, I'll try to
I have a question here. Is the merge_region command, which can be run
through the HBase shell, not reliable in 0.98 if we simply want to merge 2
regions at a time? I thought that the older Merge tool was the unsafe one.
Thanks,
Shahab
On Thu, Aug 28, 2014 at 2:26 PM, Bryan Beaudreault
-28 14:33 GMT-04:00 Shahab Yunus shahab.yu...@gmail.com:
I have a question here. Is the merge_region command, which can be run
through the HBase shell, not reliable in 0.98 if we simply want to merge 2
regions at a time? I thought that the older Merge tool was the unsafe one.
Thanks,
Shahab
You don't need to initialize the tables.
You just need to specify the output format as the MultiTableOutputFormat
class.
Something like this:
job.setOutputFormatClass(MultiTableOutputFormat.class);
Because if you see the code for MultiTableOutputFormat, it creates the
table on the fly and
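Roughly, the reducer then picks the target table per write by using the
table name as the output key (table/column names below are placeholders):

  // driver:
  job.setOutputFormatClass(MultiTableOutputFormat.class);

  // in the reducer, the output key names the target table:
  ImmutableBytesWritable target =
      new ImmutableBytesWritable(Bytes.toBytes("table1"));
  Put put = new Put(rowKey);
  put.add(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes("value"));
  context.write(target, put); // routed to "table1"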
);
boolean b = job.waitForCompletion(true);
if (!b) {
throw new IOException("error with job!");
}
I am unable to figure out what I am missing here,
-yeshwanth
On Wed, Aug 27, 2014 at 12:23 AM, Shahab Yunus shahab.yu...@gmail.com
wrote:
You don't need to initialize the tables
:
that MapReduce job reads data from an HBase table;
it doesn't take any explicit input data/file/
-yeshwanth
On Wed, Aug 27, 2014 at 12:44 AM, Shahab Yunus shahab.yu...@gmail.com
wrote:
Where are you setting the input data/path/format for the job? I don't see
that in the code below
job); // otherArgs[0]=i1
TableMapReduceUtil.initTableReducerJob(otherArgs[0], null, job);
Ted suggested removing them.
If you see the first message in this thread, you will know the issue caused
by specifying the table.
-yeshwanth
On Wed, Aug 27, 2014 at 12:54 AM, Shahab Yunus shahab.yu...@gmail.com
19, 2014 at 7:00 PM, Ted Yu yuzhih...@gmail.com wrote:
My suggestion wasn't about pre-splitting.
You can insert dummy values as part of your proof-of-concept code -
before admin.split()
is called.
On Tue, Aug 19, 2014 at 3:50 PM, Shahab Yunus shahab.yu...@gmail.com
wrote:
Ted,
Hmmm
I have a table already created and with some data. I want to split it
through code using the HBaseAdmin API into multiple regions, while specifying
keys that do not exist in the table.
I am getting the exception below which makes sense because the key doesn't
exist yet. But at the time of creation of
...@spaggiari.org wrote:
Hi Shahab,
can you share your code? Seems that the RS you reached did not have the
expected region. How does your table status look in the web interface?
JM
2014-08-19 16:11 GMT-04:00 Shahab Yunus shahab.yu...@gmail.com:
I have a table already created and with some data
whose keys correspond to the splits ?
Cheers
On Tue, Aug 19, 2014 at 1:29 PM, Shahab Yunus shahab.yu...@gmail.com
wrote:
So the situation here is that we are trying to bulk load data into a
table. But each load of data has such a range of keys that it will go to a
specific contiguous chunk
I couldn't decide whether it is an HBase question or a Hadoop/YARN one.
In the utility class for MR jobs integrated with HBase,
org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil,
in the method:
public static void initTableReducerJob(String table,
Class<? extends TableReducer>
You can use the util classes provided already. Note that it won't be very
fast and you might want to try out bulk import as well (especially if it is
a one-time or rare occurrence). It depends on your use case. Check out the
documentation below:
For the Map Reduce Hbase util:
Parkirat,
This is a core Java concept which is mainly related to how Class
inheritance works in Java and how the @Override annotation is used, and is
not Hadoop specific. (It is also used while implementing interfaces since
JDK 6.)
You can read about it here:
Add the @Override annotation at the top of the 'reduce' method and then try
(just like you are doing for the 'map' method):
public class WordCountReducer extends Reducer<Text, IntWritable, Text,
IntWritable> {
@Override
protected void reduce(Text key, Iterable<IntWritable> values,
There is a trick. You can use MultipleOutputs with TableMapReduceUtil. In
the Reducer you can write to the desired outputs on HDFS using MultipleOutputs
and the HBase util will do its work as is.
The only caveat is that you will have to commit the files that you have
written using MultipleOutputs
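A bare-bones sketch of the idea (names are made up; "side" must be registered
in the driver via MultipleOutputs.addNamedOutput, and closing the
MultipleOutputs in cleanup is what commits the side files):

  // imports: java.io.IOException, org.apache.hadoop.io.Text,
  // org.apache.hadoop.hbase.client.{Put, Mutation},
  // org.apache.hadoop.hbase.io.ImmutableBytesWritable,
  // org.apache.hadoop.hbase.mapreduce.TableReducer,
  // org.apache.hadoop.hbase.util.Bytes,
  // org.apache.hadoop.mapreduce.lib.output.MultipleOutputs
  public class MyReducer
      extends TableReducer<Text, Text, ImmutableBytesWritable> {
    private MultipleOutputs<ImmutableBytesWritable, Mutation> mos;

    @Override
    protected void setup(Context context) {
      mos = new MultipleOutputs<ImmutableBytesWritable, Mutation>(context);
    }

    @Override
    protected void reduce(Text key, Iterable<Text> values, Context context)
        throws IOException, InterruptedException {
      Put put = new Put(Bytes.toBytes(key.toString()));
      put.add(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes("v"));
      context.write(null, put); // goes to the HBase table as usual
      mos.write("side", key, values.iterator().next()); // plain HDFS file
    }

    @Override
    protected void cleanup(Context context)
        throws IOException, InterruptedException {
      mos.close(); // without this the side files never get committed
    }
  }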
Can you explain a bit what issue you are facing?
Sqoop's documentation explains quite clearly how to import data from MySQL
to HBase. You can use those commands in a script to automate the process.
http://sqoop.apache.org/docs/1.4.3/SqoopUserGuide.html#_selecting_the_data_to_import
Regards,
Hive and HBase are 2 different tools/technologies. They are used together
but they are not interchangeable.
Hive is for on-demand, RDBMS/SQL-like data access while HBase is the actual
data store. Hive runs on HBase providing an on-demand, SQL-like API.
Regards,
Shahab
On Wed, Apr 30, 2014 at 4:34
It says:
RemoteException(java.io.IOException): /hbase/test is non empty
Is the directory empty or are there files from some previous runs? Does the
user have access to delete the data here?
Regards,
Shahab
On Tue, Mar 25, 2014 at 7:42 AM, Mohamed Ghareb m.ghar...@tedata.net wrote:
How I can
Also Matteo, just like distcp, one advantage of this (using webhdfs while
copying) could also be that even if the versions are not the same, we can
still copy?
Regards,
Shahab
On Fri, Mar 21, 2014 at 8:14 AM, Matteo Bertozzi theo.berto...@gmail.com wrote:
ExportSnapshot uses the FileSystem API
so
http://www.slideshare.net/jaxlondon2012/hbase-advanced-lars-george
http://hortonworks.com/hadoop/hbase/
Regards,
Shahab
On Fri, Feb 28, 2014 at 8:36 AM, Vimal Jain vkj...@gmail.com wrote:
Hi,
Which one of the storage structures does HBase use? Is it LSM tree,
SSTable, or fractal tree?
Also adding to what Ted mentioned, the following book has more details:
Take a look at using the webHDFS protocol to use distcp between clusters
with different versions:
On Mon, Oct 28, 2013 at 3:14 PM, S. Zhou myx...@yahoo.com wrote:
I need to copy data from Hadoop cluster A to cluster B. I know I can use
the distCp tool to do that. Now the problem is: cluster A has
Sorry, the last email was accidentally sent before I could finish it.
Take a look at using the webHDFS protocol to use distcp between clusters
with different versions:
I only know of the links already embedded in the blog page that I sent
you, or you have this:
https://groups.google.com/forum/#!forum/hbasewd
Regards,
Shahab
On Tue, Sep 24, 2013 at 11:12 AM, anil gupta anilgupt...@gmail.com wrote:
Inline
On Mon, Sep 23, 2013 at 6:15 PM, Shahab Yunus
From where are you running your job? From which machine? This client
machine from where you are kicking off this job should have the
hbase-site.xml with the correct ZK info in it. It seems that your
client/job is having an issue picking up the right ZK, rather than the
services running on your
http://blog.sematext.com/2012/04/09/hbasewd-avoid-regionserver-hotspotting-despite-writing-records-with-sequential-keys/
Here you can find the discussion, trade-offs and working code/API (even for
M/R) about this and the approach you are trying out.
Regards,
Shahab
On Mon, Sep 23, 2013 at 5:41
:51 PM, Shahab Yunus shahab.yu...@gmail.com
wrote:
http://blog.sematext.com/2012/04/09/hbasewd-avoid-regionserver-hotspotting-despite-writing-records-with-sequential-keys/
Here you can find the discussion, trade-offs and working code/API (even
for
M/R) about this and the approach you
Some quick thoughts: well, your size is bound to increase because recall
that the rowkey is stored in every cell. So if in CSV you have, let us
say, 5 columns and you imported them to HBase using the first column as the
key, then you will end up with essentially 9 (1 for the rowkey and then 2
context)
So the type parameters above should facilitate this.
Take a look at the PutCombiner from HBase source code:
public class PutCombiner<K> extends Reducer<K, Put, K, Put> {
Cheers
On Thu, Sep 5, 2013 at 9:46 AM, Shahab Yunus shahab.yu...@gmail.com
wrote:
Ted,
Might be a something
Try using Bytes.toBytes(yourString) rather than String.getBytes().
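The difference matters because String.getBytes() uses the platform default
charset, while Bytes.toBytes consistently uses UTF-8. E.g. (hypothetical key):

  byte[] row = Bytes.toBytes("myRowKey"); // always UTF-8
  Get get = new Get(row);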
Regards,
Shahab
On Thu, Sep 5, 2013 at 2:16 AM, Omkar Joshi omkar.jo...@lntinfotech.com wrote:
I'm trying to execute MR code over stand-alone HBase (0.94.11). I had
read the HBase API and modified my MR code to read data and
Your read queries seem to be more driven from the 'action' and 'object'
perspective, rather than the user.
1- So one option is that you make a composite key with action and object:
action|object, and the columns are users who are generating events on this
combination. You can scan using a prefix filter
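A sketch with made-up values, assuming keys shaped like action|object|userId:

  // all users generating "click" events on "item123"
  Scan scan = new Scan();
  scan.setFilter(new PrefixFilter(Bytes.toBytes("click|item123|")));
  ResultScanner results = table.getScanner(scan);
  for (Result r : results) {
    // each row: one user event for this action|object combination
  }
  results.close();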
AM, Shahab Yunus shahab.yu...@gmail.com
wrote:
My 2 cents:
1- Yes, that is one way to do it. You can also use a fixed length for every
attribute participating in the composite key. HBase scans would be more
fitting to this pattern as well, I believe (?). It's basically a trade-off
between
Ted,
Might be something very basic that I am missing, but why should the OP's
reducer's key be of type ImmutableBytesWritable if he is emitting Text in
the mapper? Thanks.
protected void map(ImmutableBytesWritable key, Result value,
This may be a newbie or dumb question but I believe this does not affect or
apply to HBase distributions by other vendors like Hortonworks or Cloudera.
If someone is using one of the versions of distributions provided by them
then it is up to them (and not people and community here) what and till
What advantage will you be gaining by compressing? Less space? But then it
will add compression/decompression performance overhead. A trade-off, but
not an especially significant one, as space is cheap and redundancy is OK
with such data stores.
Having said that, more importantly, what are your read
the keys as NullWritable or LongWritable, i.e. by
keeping the same types of keys, I am getting the same error.
I don't think the error is on the Map input side. It's saying value from map.
Can't understand where I am going wrong.
Regards
Praveenesh
On Thu, Aug 29, 2013 at 4:58 PM, Shahab Yunus
Exactly, I had the same thought as Ashwanth; that is why I asked whether
the @Override annotation is being used or not.
Regards,
Shahab
On Thu, Aug 29, 2013 at 1:09 PM, Ashwanth Kumar
ashwanthku...@googlemail.com wrote:
Hey Praveenesh, I am not sure if this would help.
But can you try moving
Taking what Ravi Kiran mentioned a level higher, you can also use Pig. It
has DBStorage. Very easy to read from HBase and dump to MySQL if your data
porting does not require complex transformation (even that can be handled
in Pig too.)
Do I understand it correctly that when I create lots of tables, but they
all use the same column family (by name), that I am just using one column
family and I am OK with respect to limiting the number of column families?
I don't think so. Column families are per table. Even if the name of the
Can you please explain or show the flow of the code a bit more? Why are you
creating the HTable object again and again in the mapper? Where is
ContentidxTable
(the name of the table, I believe?) defined? What is your actual
requirement?
Also, have you looked into this, the API for wiring HBase
, Aug 19, 2013 at 7:05 PM, Shahab Yunus shahab.yu...@gmail.com
wrote:
Can you please explain or show the flow of the code a bit more? Why are
you creating the HTable object again and again in the mapper? Where is
ContentidxTable
(the name of the table, I believe?) defined? What is your
As for your second question, I think the hbase-site.xml file with default
options is placed in the hbase jar files/libs on the RS. Someone can correct
me if I am wrong.
Regards,
Shahab
On Thu, Aug 8, 2013 at 6:30 PM, Kim Chew kchew...@gmail.com wrote:
Hello there,
As titled. Also I would
Though it is better, as Ted suggested, to discuss this in the Sqoop mailing
list (as Sqoop 2 is supposed to be more feature rich), just to get this out:
Sqoop does support incremental imports if you can come up with a suitable
and compatible strategy. That might help you if you configure your imports
on
Please correct me if I am wrong but I think there is as such no hard and
fast technique for it. There are no constructs or methods for this
specifically in HBase. Your client, while writing, has to make sure to write
to both tables: 1) the main table 2) and the secondary index table.
Basically it is
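For illustration only, a bare-bones dual write (table handles, key shapes,
and error handling are all assumed; note the two puts are not atomic, which
is the main difficulty with manual secondary indexes):

  Put main = new Put(rowKey);
  main.add(Bytes.toBytes("cf"), Bytes.toBytes("q"), value);
  mainTable.put(main);

  Put index = new Put(value); // index row key = the indexed value
  index.add(Bytes.toBytes("cf"), Bytes.toBytes("ref"), rowKey);
  indexTable.put(index);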
is more of a temporary table which I can just do
a get operation and get the value?
Am I right?
Regards,
Pavan
On Jul 31, 2013 7:24 PM, Shahab Yunus shahab.yu...@gmail.com wrote:
Please correct me if I am wrong but I think there is as such no hard and
fast technique
wrote:
If I have the value of a row in JSON format, would Pig be able to
parse it and join the fields as per my needs?
On Fri, Jul 19, 2013 at 10:00 PM, Shahab Yunus
shahab.yu...@gmail.com
wrote:
You can also look into Pig, if you already haven't. It supports various
kinds
You can also look into Pig, if you already haven't. It supports various
kinds of joins and is simpler than writing your own M/R job (assuming that
you don't have complex or custom requirements.)
Regards,
Shahab
On Fri, Jul 19, 2013 at 12:24 AM, Pavan Sudheendra pavan0...@gmail.com wrote:
Hi,
Not saying this is a solution or better in any way, but just more food for
thought. Is there any maximum size limit for UserIds? You can also pad
User Ids of smaller length. You are using more space this way though.
It can help in sorting as well.
Regards,
Shahab
On Mon, Jul 8, 2013 at
I think you will need to update your hash function and redistribute data.
As far as I know this has been one of the drawbacks of this approach (and
the Sematext library)
Regards,
Shahab
On Wed, Jun 26, 2013 at 7:24 PM, Joarder KAMAL joard...@gmail.com wrote:
May be a simple question to answer
to guide me through any reference which can confirm this
understanding?
Regards,
Joarder Kamal
On 27 June 2013 23:24, Shahab Yunus shahab.yu...@gmail.com wrote:
I think you will need to update your hash function and redistribute data.
As far as I know this has been one of the drawbacks
Have you tried creating your own small script in which you set the relevant
environment variables per session (using 'export' for example)?
On Mon, Jun 24, 2013 at 1:33 AM, Stephen Boesch java...@gmail.com wrote:
We want to connect to a non-default / remote hbase server by setting
'hbase' does not seem to have a --config/-config parameter.
Regards,
Shahab
On Mon, Jun 24, 2013 at 8:39 AM, rajeshbabu chintaguntla
rajeshbabu.chintagun...@huawei.com wrote:
Can you try copying hbase-site.xml to another folder and changing
hbase.zookeeper.quorum to the remote server and then use