ck to the main
> working for.
> Sent from mobile, excuse any typos.
> On Jul 27, 2014 10:07 AM, "Anoop John" wrote:
>
>> As per Shankar, he can get things working with the below configs
>>
>>
>> hbase.regionserver.hlog.reader.impl
>
not moved under corrupt logs
is a concerning thing. Need to look at that.
>
> Agreed.
>
>
>> On Jul 27, 2014, at 1:07 AM, Anoop John wrote:
>>
>> As per Shankar, he can get things working with the below configs
>>
>>
>>hbase.regionser
It will be the key of the KeyValue. The key includes
rowkey (rk) + cf + qualifier + ts + type.
So all of these are part of the key. Your answer #1 is correct (but with the
addition of type also).. Hope this makes it clear for you.
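A rough sketch of how those parts map onto the KeyValue constructor (the
row/cf/qualifier names here are just placeholders):

  import org.apache.hadoop.hbase.KeyValue;
  import org.apache.hadoop.hbase.util.Bytes;

  // Everything below except the value becomes part of the KeyValue's key.
  KeyValue kv = new KeyValue(
      Bytes.toBytes("row1"),       // rowkey (rk)
      Bytes.toBytes("cf"),         // column family
      Bytes.toBytes("q1"),         // qualifier
      System.currentTimeMillis(),  // timestamp (ts)
      KeyValue.Type.Put,           // type
      Bytes.toBytes("value"));     // value - NOT part of the key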
-Anoop-
On Tue, Aug 5, 2014 at 9:43 AM, innowireless TaeYun Kim <
taeyun@innowireless.c
Shankar
You are getting this only when HFile encryption is enabled? Seeing
the exception, I don't think it is directly related to encryption or so.
Suggest testing without encryption multiple times and seeing whether you get
the same in that case also.
-Anoop-
On Fri, Aug 8, 2014 at 11:29 AM, Esteban
Running the IT IntegrationTestIngestWithVisibilityLabels fails. This is
because we are not handling the Deletes in
LoadTestDataGeneratorWithVisibilityLabels.
As it is an issue only with the IT test and there are no code-level issues,
you can take a call, Andy. I have raised HBASE-11716 and attached a simple
to fail an RC.
>
> Regards
> Ram
>
>
> On Mon, Aug 11, 2014 at 10:50 AM, Anoop John
> wrote:
>
> > Running the IT IntegrationTestIngestWithVisibilityLabels fails. This
> is
> > because we are not handling the Deletes in
> > LoadTestDataGeneratorWithVisib
What about your KV size and HFile block size for the table? For a random
read type of use case, a lower value for the HFile block size might help.
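For example, a rough sketch (the CF name and the 8 KB figure are just
illustrative; the default block size is 64 KB):

  import org.apache.hadoop.hbase.HColumnDescriptor;

  HColumnDescriptor cf = new HColumnDescriptor("cf");
  // Smaller blocks mean fewer bytes read and cached per random Get.
  cf.setBlocksize(8 * 1024);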
-Anoop-
On Fri, Aug 15, 2014 at 1:56 AM, Esteban Gutierrez
wrote:
> If not set in hbase-site.xml both tcpnodelay and tcpkeepalive are set to
> true (th
Please have a look at HFileBlock#heapSize();
you can learn the overhead by reading this.
-Anoop-
On Fri, Aug 15, 2014 at 1:50 AM, Nick Dimiduk wrote:
> I'm not aware of specifically this experiment. You might have a look at our
> HeapSize interface and its implementations for things like HFileBlock.
Great work! Thanks a lot Misty...
-Anoop-
On Wed, Aug 20, 2014 at 11:56 AM, ramkrishna vasudevan <
ramkrishna.s.vasude...@gmail.com> wrote:
> Great job !! Keep it up.!!!
>
> Regards
> Ram
>
>
> On Wed, Aug 20, 2014 at 11:49 AM, rajeshbabu chintaguntla <
> rajeshbabu.chintagun...@huawei.com> wr
>Is it possible that the put method call on HTable does not actually put
the record in the database while also not throwing an exception?
You can. Implement a region CP (implementing RegionObserver) and implement
prePut(). In this you can bypass the operation using ObserverContext#bypass().
So cor
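A rough sketch of such an observer (assuming the 0.98-era hook signature,
which varies across versions; the class name and shouldDrop() predicate are
hypothetical):

  import java.io.IOException;
  import org.apache.hadoop.hbase.client.Durability;
  import org.apache.hadoop.hbase.client.Put;
  import org.apache.hadoop.hbase.coprocessor.BaseRegionObserver;
  import org.apache.hadoop.hbase.coprocessor.ObserverContext;
  import org.apache.hadoop.hbase.coprocessor.RegionCoprocessorEnvironment;
  import org.apache.hadoop.hbase.regionserver.wal.WALEdit;

  public class SilentDropObserver extends BaseRegionObserver {
    @Override
    public void prePut(ObserverContext<RegionCoprocessorEnvironment> ctx,
        Put put, WALEdit edit, Durability durability) throws IOException {
      if (shouldDrop(put)) { // your own condition goes here
        ctx.bypass();        // Put is skipped and no exception is thrown
      }
    }

    private boolean shouldDrop(Put put) {
      return false; // placeholder
    }
  }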
And you have to implement
transformCell(final Cell v)
in your custom Filter.
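A rough sketch of such a filter (the class name and rewrite() helper are
hypothetical; a real filter also needs serialization support to be usable
from a client):

  import java.io.IOException;
  import org.apache.hadoop.hbase.Cell;
  import org.apache.hadoop.hbase.filter.FilterBase;

  public class ValueRewritingFilter extends FilterBase {
    @Override
    public Cell transformCell(Cell v) throws IOException {
      // Return a rewritten Cell (same key parts, transformed value).
      return rewrite(v);
    }

    private Cell rewrite(Cell v) {
      return v; // placeholder
    }
  }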
JFYI
-Anoop-
On Fri, Sep 12, 2014 at 4:36 AM, Nishanth S wrote:
> Sure Sean. This is much needed.
>
> -Nishan
>
> On Thu, Sep 11, 2014 at 3:57 PM, Sean Busbey wrote:
>
> > I filed HBASE-11950 to get some details adde
Again, a full code snippet can speak better.
But I am not getting what you are doing with the below code:
private List generatePartitions() {
List regionScanners = new ArrayList();
byte[] startKey;
byte[] stopKey;
HConnection connection = null;
HBaseAdmin hbaseAdmin = null;
You have more than one increment for the same key in one batch?
On Wed, Sep 17, 2014 at 12:33 PM, Vinay Gupta
wrote:
> Also the regionserver keeps throwing exceptions like
>
> 2014-09-17 06:56:07,151 DEBUG [RpcServer.handler=10,port=60020]
> regionserver.ServerNonceManager: Conflict detected by
s.
-Anoop-
On Wed, Sep 17, 2014 at 1:04 PM, Vin Gup
wrote:
> Yes possibly. Why would that be a problem?
> Earlier client (0.94) didn't complain about it.
>
> Thanks,
> -Vinay
>
> > On Sep 17, 2014, at 12:16 AM, Anoop John wrote:
> >
> > You have mo
this error even with
> batches with no row key duplicates. I still suspect that client is timing
> out and retrying too often and needs to back off as the region server is
> heavily loaded.
>
> -Vinay
>
> > On Sep 17, 2014, at 3:14 AM, Anoop John wrote:
> >
> > This
Hi
Even when the RS throws this Exception, the client side will start a new
Scanner and retry. Do you just see this in the log, or is the scan failing
altogether? What is the caching you use on the Scan? When most of the rows are
filtered out at the server side, it takes more time to fetch and return the
'cachin
I receive this error in client side, and pretty sure the scan failed.
> > I'm using default caching, so it should be 100, right?
> > About scan time out period, I will try to set it higher, probably 1 hour.
> >
> > BTW, I'm using hbase 0.96.0.
> >
> > Bes
You have ~280 regions per RS,
and your memstore size % is 40% with a heap size of 48GB.
This means the heap available for memstores is 48 * 0.4 = 19.2GB (I am just
considering the upper watermark alone).
If you have to consider all 280 regions, each with 512 MB, you need much
more size of heap. And your writ
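(Spelling out the arithmetic: 280 regions x 512 MB flush size would be
~140 GB of potential memstore demand, against the 48 GB x 0.4 = 19.2 GB
actually available.)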
So you want one version with ts <= given ts?
Have a look at Scan#setTimeRange(long minStamp, long maxStamp)
If you know the exact ts for cells, you can use Scan#setTimeStamp(long
timestamp)
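A rough sketch (givenTs stands for your upper-bound timestamp):

  import org.apache.hadoop.hbase.client.Scan;

  Scan scan = new Scan();
  scan.setTimeRange(0L, givenTs + 1); // maxStamp is exclusive; +1 includes givenTs
  scan.setMaxVersions(1);             // newest version within the range per cell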
-Anoop-
On Wed, Nov 12, 2014 at 11:17 AM, Krishna Kalyan
wrote:
> For Example for table 'test_table', Valu
If you want the delete and the new row put in a single transaction (well, that
is the best thing to do), you can try using mutateRow(final RowMutations rm).
Add a delete mutation followed by a Put.
You should be careful about the timestamps of the 2 Mutations. You should
provide the ts from the client side.
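A rough sketch (row/cf/oldQ/newQ/value are placeholders; note the Put gets a
higher client-supplied ts so the delete marker does not mask the new cell):

  import org.apache.hadoop.hbase.client.Delete;
  import org.apache.hadoop.hbase.client.Put;
  import org.apache.hadoop.hbase.client.RowMutations;

  long ts = System.currentTimeMillis();
  RowMutations rm = new RowMutations(row); // all mutations must share this row
  Delete d = new Delete(row);
  d.deleteColumns(cf, oldQ, ts);  // 0.94/0.96-era API (later: addColumns)
  rm.add(d);
  Put p = new Put(row);
  p.add(cf, newQ, ts + 1, value); // 0.94/0.96-era API (later: addColumn)
  rm.add(p);
  table.mutateRow(rm);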
May b
byte[] email_b = Bytes.toBytes(mail); // column qualifier
byte[] colmnfamily = Bytes.toBytes("colmn_fam"); // column family
Scan scan_col = new Scan(Bytes.toBytes("colmn_fam"), email_b);
The Scan constructor takes start and stop rows (rowkeys). You seem to be
passing cf and qualifier names.
Scan s = new Sca
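What was probably intended, as a rough sketch (cf and qualifier go through
addColumn(), not the constructor):

  import org.apache.hadoop.hbase.client.Scan;
  import org.apache.hadoop.hbase.util.Bytes;

  Scan scan = new Scan(); // or new Scan(startRow, stopRow) to bound by rowkey
  scan.addColumn(Bytes.toBytes("colmn_fam"), email_b);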
You can pre-split the table as per the key ranges and use a custom Load
Balancer to keep the regions on the required nodes (?). It seems you have to
colocate the 2 tables' regions on these nodes (to do the join)... So I hope
you are already working with the LB.
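A rough pre-split sketch (table name and split points are just illustrative;
exception handling elided):

  import org.apache.hadoop.hbase.HColumnDescriptor;
  import org.apache.hadoop.hbase.HTableDescriptor;
  import org.apache.hadoop.hbase.client.HBaseAdmin;
  import org.apache.hadoop.hbase.util.Bytes;

  HBaseAdmin admin = new HBaseAdmin(conf);
  HTableDescriptor desc = new HTableDescriptor("mytable");
  desc.addFamily(new HColumnDescriptor("cf"));
  byte[][] splitKeys = { Bytes.toBytes("g"), Bytes.toBytes("n"),
      Bytes.toBytes("t") };
  admin.createTable(desc, splitKeys); // one region per key range from day one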
-Anoop-
On Wed, Apr 8, 2015 at 8:17 AM, Alok Singh wrot
bq. while the region can surely split when more data is added on, but can
HBase keep the new regions still on the same regionserver according to the
predefined boundary?
You need a custom LB for that.. With one in place, it is possible to restrict.
-Anoop-
On Thu, Apr 9, 2015 at 12:09 AM, Demai Ni wrote:
> hi
If you want the data of timeStamp2 also (all 6 rows as shown in the eg:
above), then you have to put timeStamp3 in the stop row.. The stop row is
exclusive.
startRow: aabb|timeStamp1|
stopRow: aabb|timeStamp3
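As a rough sketch (assuming string-encoded keys as in the example):

  import org.apache.hadoop.hbase.client.Scan;
  import org.apache.hadoop.hbase.util.Bytes;

  Scan scan = new Scan();
  scan.setStartRow(Bytes.toBytes("aabb|" + timeStamp1)); // inclusive
  scan.setStopRow(Bytes.toBytes("aabb|" + timeStamp3));  // exclusive, so rows
      // at timeStamp1 and timeStamp2 are both returned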
-Anoop-
On Wed, Apr 22, 2015 at 5:09 PM, Shahab Yunus
wrote:
> I see that you are already
Yes, the KVs coming out from your delegate Scanner will be in sorted form,
with all the other logic applied too, like removing TTL-expired data, handling
max versions etc. Thanks for updating..
-Anoop-
On Sat, Oct 20, 2012 at 1:11 AM, PG wrote:
> Hi, Anoop and Ram,
> As I have coded the idea, th
Hi
Using the ImportTSV tool you are trying to bulk load your data. Can you see
and tell how many mappers and reducers there were? Out of the total time, what
is the time taken by the mapper phase and by the reducer phase? Seems like an
MR-related issue (maybe some conf issue). In this bulk load case most
How many rows do you want to get within the CP? What is the time taken now?
Do you have block caching enabled? Please observe the cache hit ratio.
As you issue the Gets at the server side only (CP), one-by-one gets are the
only way.
Also, how many CFs, and how many HFiles for each of the CFs? Have you tried
Blooms?
-Anoo
create
> > HFiles directly.
> >
> > Regards
> > Ram
> >
> > On Wed, Oct 24, 2012 at 8:59 AM, Anoop John
> wrote:
> >
> > > Hi
> > > Using ImportTSV tool you are trying to bulk load your data. Can you
> > see
> > >
RS
will be used.. So there is no point in WAL there... Am I making it clear
for you? The data is already present in the form of raw data in some txt or
csv file :)
-Anoop-
On Wed, Oct 24, 2012 at 10:41 AM, Anoop John wrote:
> Hi Anil
>
>
>
> On Wed, Oct 24, 2012 at 10:39 AM, a
That's a very interesting fact. You made it clear but my custom Bulk Loader
> generates an unique ID for every row in map phase. So, all my data is not
> in csv or text. Is there a way that i can explicitly turn on WAL for bulk
> loading?
>
> On Tue, Oct 23, 2012 at 10:14 PM, Anoop Jo
n that mapper again on the same data
> set.. Then the unique id will be different?
>
> Anil: Yes, for the same dataset also the UniqueId will be different.
> UniqueID does not depend on the data.
>
> Thanks,
> Anil Gupta
>
> On Tue, Oct 23, 2012 at 11:07 PM, Anoop John
>
>What I still don't understand is, since both CP and MR are both
>running on the region side, why is the MR better than the CP?
For the case of bulk delete alone, CP (Endpoint) will be better than MR for
sure.. Considering your overall need, people were suggesting MR as better..
You need a scan and move s
You have one CF such that all rows will have KVs for that CF?
You need to implement your own filter.
Your scan can select the above CF and the one on which you need the filtering.
Have a look at the QualifierFilter; you might need to take a similar approach
in the new filter.. Good luck :)
-Anoop-
On Thu
Ram, this issue was for prePut(); postPut() was fine.
Can you take a look at what the corresponding RS
threads are doing at the time of the slow put?
Maybe we can get some clues from that.
-Anoop-
On Fri, Nov 30, 2012 at 2:04 PM, ramkrishna vasudevan <
ramkrishna.s.vasude...@gmail.com> wrote
ramkrishna vasudevan
> wrote:
> > Ok...fine...Ya seeing what is happening in postPut should give an idea.
> >
> > Regards
> > Ram
> >
> > On Sat, Dec 1, 2012 at 1:52 PM, Anoop John
> wrote:
> >
> >> Ram, This issue was for prePut()..po
Hi Manoj
Can you tell more about your use case? Do you know the rowkey
range which needs to be deleted (all the rowkeys)? Or is it that, based on
some condition, you want to delete a set of rows? Which version of HBase
are you using?
HBASE-6284 provided some performance improvement in c
In that case the CP hook might need to make an RPC call.. Might be to
another RS?
In this case why can't you think of doing both table updates from the
client side? Sorry, I am not fully sure about your use case.
-Anoop-
On Wed, Dec 5, 2012 at 11:00 PM, Amit Sela wrote:
> And if they are not i
>Can I load file rows to hbase table without importing to hdfs
Where do you want the data to get stored finally? I mean the raw data.. I
assume in HDFS only (that is what you want).
Have a look at the ImportTSV tool..
-Anoop-
On Thu, Dec 13, 2012 at 9:23 PM, Mehmet Simsek wrote:
> Can I load file rows to hbase t
>how the massive number of get() is going to
perform against the main table
Didn't follow you completely here. There won't be any get() happening.. As we
get the exact rowkey in a region from the index table, we can seek to the
exact position and return that row.
-Anoop-
On Thu, Dec 27, 2012 at 6:37
Hi
Can you check using the API HTable#batch()? Here you can batch a
number of Increments for many rows in just one RPC call. It might help you to
reduce the net time taken. Good luck.
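A rough sketch of batching increments (family/qualifier/rowKeys/table are
placeholders):

  import java.util.ArrayList;
  import java.util.List;
  import org.apache.hadoop.hbase.client.Increment;
  import org.apache.hadoop.hbase.client.Row;

  List<Row> actions = new ArrayList<Row>();
  for (byte[] rowKey : rowKeys) {
    Increment inc = new Increment(rowKey);
    inc.addColumn(family, qualifier, 1L);
    actions.add(inc);
  }
  Object[] results = new Object[actions.size()];
  table.batch(actions, results); // grouped into one round trip per RS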
-Anoop-
On Sat, Jan 12, 2013 at 4:07 PM, kiran wrote:
> Hi,
>
> My usecase is I need to increment 1 milli
Hi
Can you think of using HFileOutputFormat? You use
TableOutputFormat now, so there will be put calls to HTable. Instead, with
HFileOutputFormat the MR will write the HFiles directly [no flushes,
compactions]. Later you need to load the HFiles into the regions using
LoadIncrementalHFiles.
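A rough sketch of that flow with the 0.9x-era APIs (the mapper class and
paths are assumed; exception handling elided):

  import org.apache.hadoop.fs.Path;
  import org.apache.hadoop.hbase.KeyValue;
  import org.apache.hadoop.hbase.client.HTable;
  import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
  import org.apache.hadoop.hbase.mapreduce.HFileOutputFormat;
  import org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles;
  import org.apache.hadoop.mapreduce.Job;
  import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

  Job job = new Job(conf, "hfile-bulkload");
  job.setMapperClass(MyMapper.class); // hypothetical mapper emitting KeyValues
  job.setMapOutputKeyClass(ImmutableBytesWritable.class);
  job.setMapOutputValueClass(KeyValue.class);
  HTable table = new HTable(conf, "mytable");
  HFileOutputFormat.configureIncrementalLoad(job, table); // reducer+partitioner
  Path out = new Path("/tmp/hfiles");
  FileOutputFormat.setOutputPath(job, out);
  if (job.waitForCompletion(true)) {
    new LoadIncrementalHFiles(conf).doBulkLoad(out, table); // adopt the HFiles
  }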
> > >
> > > Thanks
> > >
> > > On Sat, Jan 12, 2013 at 10:57 AM, Asaf Mesika
> > > wrote:
> > >
> > > > Most time is spent reading from Store file and not on network
> transfer
> > > time
> > > > of Increment obje
HBase throws this Exception when the connection between the client process and
the RS process (the connection through which the op request came) is broken.
Any issues with your client app or network? The operation will be getting
retried from the client, right?
-Anoop-
On Sun, Jan 13, 2013 at 8:24 PM, Ted Yu wrot
In your CP methods you will get the ObserverContext object, from which you can
get the HRS object:
ObserverContext.getEnvironment().getRegionServerServices()
From this HRS you can get hold of any of the regions served by that RS.
Then directly call methods on HRegion to insert data. :)
Good luck..
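A rough sketch (assuming the encoded name of the target region on the same
RS is known; API names are the 0.94/0.98-era ones):

  import java.io.IOException;
  import org.apache.hadoop.hbase.client.Put;
  import org.apache.hadoop.hbase.coprocessor.ObserverContext;
  import org.apache.hadoop.hbase.coprocessor.RegionCoprocessorEnvironment;
  import org.apache.hadoop.hbase.regionserver.HRegion;

  void putIntoSiblingRegion(ObserverContext<RegionCoprocessorEnvironment> ctx,
      String encodedRegionName, Put put) throws IOException {
    HRegion target = ctx.getEnvironment().getRegionServerServices()
        .getFromOnlineRegions(encodedRegionName);
    if (target != null) {
      target.put(put); // direct in-process write, no client RPC
    }
  }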
-Anoop-
At read time, if there is more than one HFile for a store, HBase will
read that row from all the HFiles (checking whether the row is there and, if
so, reading it) and also from the memstore. So it can get the latest data.
Also remember that there will be compaction happening for HFiles, which will
merge more
In the case of Hive, data insertion means placing the file under the table
path in HDFS. HBase needs to read the data and convert it into its own format
(HFiles). MR is doing this work.. So this makes it clear that HBase will be
slower. :) As Michael said, the read operation...
-Anoop-
On Thu, Jan 17, 2013
>lets say for a scan setCaching is
10 and scan is done across two regions. 9 Results(satisfying the filter)
are in Region1 and 10 Results(satisfying the filter) are in Region2. Then
will this scan return 19 (9+10) results?
@Anil.
No, it will return 10 results only, not 19. The client here takes into
What is this filtering at the client side doing exactly?
postScannerClose() won't deal with any scanned data. This hook will be
called later.. You should be using the hooks on the scanner's next() calls.
Mind telling us the exact thing you are doing now at the client side? Then we
might be able to suggest some thin
ng on
> the
> >> server side then maybe an EndPoint coprocessor would be more fitting.
> >> You can iterate over the InternalScanner and return a Map<> with your
> >> filtered values.
> >>
> >> You can also check this link:
> >>
> h
HBase data is ultimately persisted in HDFS, and there it will be replicated
on different nodes. But each region of an HBase table will be associated with
exactly one RS. So for any operation on that region, any client needs to
contact this RS only.
-Anoop-
On Sat, Feb 16, 2013 at 2:07 AM, Pamecha, A
Is this really related to concurrent reads? I think it is something else..
Will dig into the code tomorrow. Can you attach a junit test case which will
reproduce the NPE?
-Anoop-
On Sat, Mar 2, 2013 at 9:29 PM, Ted Yu wrote:
> Looks like the issue might be related to HTable:
>
> at org.apache.had
Matt Corgan,
I remember someone else also sent a mail some days back
looking for the same use case.
Yes, a CP can help. Maybe do the deletion of duplicates at major compaction
time?
-Anoop-
On Sun, Mar 3, 2013 at 9:12 AM, Matt Corgan wrote:
> I have a few use cases where I'd like to leverage
The guide explains it well.. Region moves across RSs and region splits will
cause the location cache (at the client) to become stale, and it will
look into META again. Memstore flushes/compactions and the like will not make
that happen. When there is a change to the META entry for a region, the
location
When you say column, you mean one column family (CF) or column qualifier?
And if this is one column qualifier, are there other qualifiers in the same
CF?
-Anoop-
On Sun, Mar 10, 2013 at 12:41 AM, yun peng wrote:
> Hi, All,
> I want to find all existing values for a given column in a HBase, and w
As per the above said, you will need a full table scan on that CF.
As Ted said, consider having a look at your schema design.
-Anoop-
On Sun, Mar 10, 2013 at 8:10 PM, Ted Yu wrote:
> bq. physically column family should be able to perform efficiently (storage
> layer
>
> When you scan a row, da
How many regions per RS? And CFs in the table?
What is the -Xmx for the RS process? You will get 35% of that memory for all
the memstores in the RS.
hbase.hregion.memstore.flush.size = 1GB!!
Can you closely observe the flushQ size and compactionQ size? You may be
getting so many small file flushes (Due t
Agree here. The effectiveness depends on what % of data satisfies the
condition and how it is distributed across HFile blocks. We will get a
performance gain when we are able to skip some HFile blocks (from the
non-essential CFs). Can you test with different HFile block sizes (a lower
value)?
-Anoop-
On
You can use MultiRowMutationEndpoint for atomic ops on multiple rows (within
the same region)..
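A rough 0.94-style usage sketch (the endpoint must be loaded on the table and
all rows must fall in the same region; put1/put2/table are placeholders):

  import java.util.ArrayList;
  import java.util.List;
  import org.apache.hadoop.hbase.client.Mutation;
  import org.apache.hadoop.hbase.coprocessor.MultiRowMutationProtocol;

  List<Mutation> mutations = new ArrayList<Mutation>();
  mutations.add(put1);
  mutations.add(put2);
  MultiRowMutationProtocol proxy = table.coprocessorProxy(
      MultiRowMutationProtocol.class, put1.getRow());
  proxy.mutateRows(mutations); // atomic across those rows in that region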
On Sun, Apr 21, 2013 at 5:55 AM, Ted Yu wrote:
> Here is code from 0.94 code base:
>
> public void mutateRow(final RowMutations rm) throws IOException {
> new ServerCallable(connection, tableName, r
Hi
How many request handlers are there in your RS? Can you up this
number and see?
-Anoop-
On Wed, Apr 24, 2013 at 3:42 PM, kzurek wrote:
> The problem is that when I'm putting my data (multithreaded client, ~30MB/s
> traffic outgoing) into the cluster the load is equally spread over a
>But it seems that I'm losing writes somewhere, is it possible the writes
could fail silently
Which version are you using? How do you say the writes are missed silently?
The current read, which was going on, has not returned the row that you just
wrote? Or have you created a new scan afterwards and in th
You are making use of batch Gets? get(List<Get>)
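A rough sketch (rowKeys/table are placeholders):

  import java.util.ArrayList;
  import java.util.List;
  import org.apache.hadoop.hbase.client.Get;
  import org.apache.hadoop.hbase.client.Result;

  List<Get> gets = new ArrayList<Get>();
  for (byte[] rowKey : rowKeys) {
    gets.add(new Get(rowKey));
  }
  Result[] results = table.get(gets); // one trip per RS instead of per row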
-Anoop-
On Tue, Apr 30, 2013 at 11:40 AM, Viral Bajaria wrote:
> Thanks for getting back, Ted. I totally understand other priorities and
> will wait for some feedback. I am adding some more info to this post to
> allow better diagnosing of performance
Apr 30, 2013 at 12:12 PM, Viral Bajaria wrote:
> I am using asynchbase which does not have the notion of batch gets. It
> allows you to batch at a rowkey level in a single get request.
>
> -Viral
>
> On Mon, Apr 29, 2013 at 11:29 PM, Anoop John
> wrote:
>
> > You a
Navis
Thanks for the issue link. Currently the read queries will start MR
jobs as usual for reading from HBase. Correct? Is there any plan for
supporting no-MR?
-Anoop-
On Thu, May 2, 2013 at 7:09 AM, Navis류승우 wrote:
> Currently, hive storage handler reads rows one by one.
>
> https://
>data in one common variable
Didn't follow you completely. Can you tell us a little more about your usage?
How exactly is the endpoint related to the CP hook (you said postPut)?
-Anoop-
On Fri, May 3, 2013 at 4:04 PM, Pavel Hančar wrote:
> Hello,
> I've just started to discover coprocessors. Namely t
I have just gone through the code and will try answering your questions.
>From what I currently understand, this operation allows me to append bytes
to an existing cell.
Yes
>Does this append by creating a new cell with a new timestamp?
Yes
>Does this update the cell while maintaining its timestamp?
No
> Current pos = 32651;
currKeyLen = 45; currValLen = 80; block limit = 32775
This means after the current position we need to have at least 45 + 80 + 4
(key length stored as 4 bytes) + 4 (value length, 4 bytes) bytes remaining.
So the limit should have been at least 32651 + 45 + 80 + 4 + 4 = 32784. If we
have memstoreTS also written with this KV some
tables,
> does something make troubles?
>
>
> 2013/5/13 Anoop John
>
> > > Current pos = 32651;
> > currKeyLen = 45; currValLen = 80; block limit = 32775
> >
> > This means after the cur position we need to have atleast 45+80+4(key
> > length stored as 4 b
Praveen,
How many regions are there in your table, and how many CFs?
Under /hbase/ there will be many files and dirs you will be able
to see. There will be a .tableinfo file and every region will have a
.regioninfo file, and then under the cf the data files (HFiles). Your total
data is 250GB. When your block size is
>now have 731 regions (each about ~350 mb !!). I checked the
configuration in CM, and the value for hbase.hregion.max.filesize is 1 GB
too !!!
Did you specify splits at the time of table creation? How did you create the
table?
-Anoop-
On Mon, May 13, 2013 at 5:18 PM, Praveen Bysani wrote:
> Hi,
re not providing any details in the configuration object ,
> except for the zookeeper quorum, port number. Should we specify explicitly
> at this stage ?
>
> On 13 May 2013 19:54, Anoop John wrote:
>
> > >now have 731 regions (each about ~350 mb !!). I checked the
> > co
>Yes bloom filters have been enabled: ROWCOL
Can you try with a ROW bloom?
-Anoop-
On Fri, May 17, 2013 at 12:20 PM, Viral Bajaria wrote:
> Thanks for all the help in advance!
>
> Answers inline..
>
> Hi Viral,
> >
> > some questions:
> >
> >
> > Are you adding new data or deleting data over time?
>So in BlockCache, does HBase store b1 and b2 separately, or store the
merged form?
It stores b1 and b2 separately: the blocks as read from the HFiles.
-Anoop-
On Mon, May 20, 2013 at 5:37 PM, yun peng wrote:
> Hi, All,
> I am wondering what is exactly stored in BlockCache: Is it the same raw
>
There is an index for the blocks in an HFile. This index contains details
like the start row in each block and its offset and length in the HFile... So
as a 1st step to get a rowkey, we will find which HFile block this rk can be
present in (I am assuming only one HFile as of now).. Now we will see
whe
level of index (like meta data index) here in memory? if
> there
> is, is it a hash index or other?...
>
> Regards
> Yun
>
>
> On Wed, May 29, 2013 at 7:45 AM, Anoop John wrote:
>
> > There is an index for the blocks in a HFile. This index contains details
> > like s
Can you have a look at issue HBASE-8476? Seems related? A fix is
available in HBASE-8346's patch..
-Anoop-
On Thu, May 30, 2013 at 9:21 AM, Kireet wrote:
> We are running hbase 0.94.6 in a concurrent environment and we are seeing
> the majority of our code stuck in this method at the synchron
> 0.96 will support HBase RPC compression
Yes
> Replication between master and slave
will enjoy it as well (important since bandwidth between geographically
distant data centers is scarce and more expensive)
But I cannot see it being utilized in replication. Maybe we can do
improvements in t
Yes, the replication can be specified at the CF level. You have used
HCD#setScope(), right?
> S => '3', BLOCKSIZE => '65536'}, {NAME => 'cf2', REPLICATION_SCOPE =>
'2',
You set the scope as 2?? You have to set one CF to be replicated to one
cluster and another to another cluster. I don't think it
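For reference, a rough sketch of the intended usage (scope is a
replicate-or-not flag, not a cluster id):

  import org.apache.hadoop.hbase.HColumnDescriptor;

  HColumnDescriptor cf2 = new HColumnDescriptor("cf2");
  cf2.setScope(1); // 1 = replicated, 0 = not replicated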
>4. This one is related to what I read in the HBase definitive guide
bloom filter section
Given a random row key you are looking for, it is very likely that this
key will fall in between two block start keys. The only way for HBase to
figure out if the key actually exists is by loading
When you set a time range on the Scan, some files can get skipped based on
the max/min ts values in each file. That said, when you major compact and then
scan based on a time range, I don't think you will get much advantage.
-Anoop-
On Wed, Jun 5, 2013 at 10:11 AM, Rahul Ravindran wrote:
> Our row-keys do
Why are there so many misses for the index blocks? What is the block cache
memory you use?
On Wed, Jun 5, 2013 at 12:37 PM, ramkrishna vasudevan <
ramkrishna.s.vasude...@gmail.com> wrote:
> I get your point Pankaj.
> Going thro the code to confirm it
> // Data index. We also read statistics about
>
>
> On Wed, Jun 5, 2013 at 12:37 AM, Anoop John wrote:
>
> > Yes the replication can be specified at the CF level.. You have used
> > HCD#setScope() right?
> >
> > > S => '3', BLOCKSIZE => '65536'}, {*NAME => 'cf2',
How many total RSs in the cluster? You mean you cannot do any operation on
other regions in the live cluster? That should not happen.. Is it
happening that the client ops are targeted at the regions which were on
the dead RS (and in transition now)? Can you have a closer look and see?
If not pl
e
> of scan for finding a key in a block. I feel that warming up the block and
> index cache could be a useful feature for many workflows. Would it be a
> good idea to have a JIRA for that?
>
> Thanks,
> Pankaj
>
>
> On Wed, Jun 5, 2013 at 1:24 AM, Anoop John wrote:
>
&
You want to have an index for every CF+CQ, right? You want to maintain diff
tables for diff columns?
Put has a getFamilyMap() method returning a Map of CF vs list of KVs. From
this list of KVs you can get all the CQ names and values etc..
-Anoop-
On Sat, Jun 8, 2013 at 11:24 PM, rob mancuso wrote:
> Hi,
>
>
Shixiaolong,
You would like to contribute your work to open source? If so,
mind raising a JIRA and attaching a solution doc for all of us?
-Anoop-
On Thu, Jun 6, 2013 at 10:31 AM, Ted Yu wrote:
> HBASE-7404 Bucket Cache has done some work in this regard.
>
> Please refer to the late
When adding data to HBase with the same key, it is the timestamp (ts) which
determines the version. Diff ts will make diff versions for the cell. But in
the case of bulk load using the ImportTSV tool, the ts used by one mapper will
be the same. All the Puts created from it will have the same ts. The tool
allows us
You can specify a max size to indicate the region split (when a region should
get split). But this size is the size of the HFile; to be precise, it is the
size of the biggest HFile under that region. If you specify this size as 10G,
then when the region has a file of size bigger than 10G the region
w
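A rough sketch of the per-table override (the 10G figure is just the example
from above):

  import org.apache.hadoop.hbase.HTableDescriptor;

  HTableDescriptor desc = new HTableDescriptor("mytable");
  desc.setMaxFileSize(10L * 1024 * 1024 * 1024); // split once the biggest
                                                 // store file exceeds 10 GB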
Have a look at FuzzyRowFilter
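A rough sketch (assuming fixed-length keys with a 4-byte fixed prefix; in the
mask, 0 means "must match this byte" and 1 means "any byte"):

  import java.util.Arrays;
  import org.apache.hadoop.hbase.client.Scan;
  import org.apache.hadoop.hbase.filter.FuzzyRowFilter;
  import org.apache.hadoop.hbase.util.Bytes;
  import org.apache.hadoop.hbase.util.Pair;

  byte[] fuzzyKey = Bytes.toBytes("ABCD????"); // '?' positions are wildcards
  byte[] mask = {0, 0, 0, 0, 1, 1, 1, 1};
  Scan scan = new Scan();
  scan.setFilter(new FuzzyRowFilter(
      Arrays.asList(new Pair<byte[], byte[]>(fuzzyKey, mask))));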
-Anoop-
On Sat, Jun 22, 2013 at 9:20 AM, Tony Dean wrote:
> I understand more, but have additional questions about the internals...
>
> So, in this example I have 6000 rows X 40 columns in this table. In this
> test my startRow and stopRow do not narrow the scan c
>the flush size is at 128m and there is no memory pressure
You mean there is enough memstore-reserved heap in the RS, so that there
won't be premature flushes because of global heap pressure? What is the RS
max mem, and how many regions and CFs in each? Can you check whether the
flushes happening b
The config "hbase.regionserver.maxlogs" specifies what is the max #logs and
defaults to 32. But remember if there are so many log files to replay then
the MTTR will become more (RS down case )
-Anoop-
On Thu, Jun 27, 2013 at 1:59 PM, Viral Bajaria wrote:
> Thanks Liang!
>
> Found the logs. I had
Viral,
Basically, when you increase the memstore flush size (well, your aim
there is to reduce flushes and make data sit in memory for a longer time) you
need to carefully consider 2 things:
1. What is the max heap, and what is the max % of memory you have allocated
for all the memstores in a RS. An
> so i can not use default scan() constructor as it will scan whole
table in one go which results in OutOfMemory error in client process
Not getting what you mean by this. The client calls next() on the Scanner and
gets the rows. The setCaching() and setBatch() determine how much data
(rows, cells
When you make the RK and convert the int parts into byte[] (use
org.apache.hadoop.hbase.util.Bytes#toBytes(int)), it will give 4 bytes
for every int. Be careful about the ordering... When you convert a +ve
and a -ve integer into byte[] and you do a lexicographical compare (as done
in HBase) you will
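A rough sketch of the pitfall and the usual sign-bit fix:

  import org.apache.hadoop.hbase.util.Bytes;

  byte[] neg = Bytes.toBytes(-1); // FF FF FF FF (two's complement)
  byte[] pos = Bytes.toBytes(1);  // 00 00 00 01
  // Lexicographic compare (what HBase does) sorts -1 AFTER 1:
  assert Bytes.compareTo(neg, pos) > 0;
  // Flipping the sign bit makes byte order agree with numeric order:
  int value = -1;
  byte[] sortable = Bytes.toBytes(value ^ Integer.MIN_VALUE);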
It is not supported from the shell. Not directly from the Delete API either..
You can have a look at the BulkDeleteEndpoint, which can do what you want to
-Anoop-
On Thu, Jul 4, 2013 at 4:09 PM, yonghu wrote:
> I check the latest api of Delete class. I am afraid you have to do it by
> yourself.
>
> regards!
wrote:
> Hi Anoop
> one more question. Can I use BulkDeleteEndpoint at the client side or
> should I use it like coprocessor which deployed in the server side?
>
> Thanks!
>
> Yong
>
>
> On Thu, Jul 4, 2013 at 12:50 PM, Anoop John wrote:
>
> > It is not s
checksumFailures will get updated when the HBase-handled checksum feature
is in use and the checksum check done at the RS side failed.. If that happens
we will retry the read from the DN with the DN checksum check enabled.
Agree that right now the HBase-handled checksum will work only with SCR.
But it might work wit
Viral,
The DFS client uses org.apache.hadoop.hdfs.BlockReaderLocal for
SCR.. I can see some debug-level logs in this:

  LOG.debug("New BlockReaderLocal for file " + blkfile + " of size "
      + blkfile.length() + " startOffset " + startOffset + " length "
      + length + " short circuit checksu
re basically logs for
> HDFS_READ and HDFS_WRITE ops. I wanted to see if it's a valid assumption
> that SCR is working if I don't see any clienttrace logs for the RS that is
> hosted on the same box as the DN.
>
> Hopefully I clarified it.
>
> On Fri, Jul 5, 2013 at 1
Hello Stan,
Is your bulk load trying to load data to multiple column
families?
-Anoop-
On Wed, Jul 10, 2013 at 11:13 AM, Stack wrote:
> File a bug Stan please. Paste your log snippet and surrounding what is
> going on at the time. It looks broke that a bulk load would be kept o
Can you be a little more specific? CPs work on a per-region basis, so they can
be utilized for distinct ops for one region.. If you want to do it overall at
the table level, then some work at the client side will also be needed..
Have a look at Phoenix. http://forcedotcom.github.io/phoenix/functions.html
On Thu, Ju