Re: Moving 2.0 forward

2017-01-13 Thread Andrew Purtell
While I don't disagree that half-finished features are undesirable, I'm not 
suggesting that as a strategy so much as suggesting we kick out stuff that 
just doesn't seem to be getting done. Pushing 2.0 out another three months is 
fine if there's a good chance this is realistic and we won't be having this 
discussion again then. Let me have a look at the doc and return with specific 
points for further discussion (if any). 


> On Jan 13, 2017, at 11:25 PM, Stack  wrote:
> 
>> On Sat, Dec 31, 2016 at 12:16 PM, Stephen Jiang  
>> wrote:
>> Hello, Andrew. I was a helper to Matteo so that we could help each other
>> while we were focusing on the new Assignment Manager work.  Now he is not
>> available (at least for the next few months).  I have to be more focused on
>> the new AM work, plus other work at my company; it would be too much for me
>> to RM 2.0 alone.  I am happy for someone to take the primary 2.0 RM role
>> while I still help to make this 2.0 release smooth.
>> 
> 
> (I could help out Stephen. We could co-RM?)
>  
>> For branch-2, I think it is too early to cut it, as we still have a lot of
>> moving parts and on-going projects that need to be part of 2.0.  For
>> example, the mentioned new AM (and other projects, such as HBASE-14414,
>> HBASE-15179, HBASE-14070, HBASE-14850, HBASE-16833, HBASE-15531, just to
>> name a few).  Cutting the branch now would add burden to completing those
>> projects.
>> 
> 
> Agree with Stephen. A bunch of stuff is half-baked so a '2.0.0' now would be 
> all loose ends and it'd make for a messy narrative.
> 
> I started a doc listing state of 2.0.0: 
> https://docs.google.com/document/d/1WCsVlnHjJeKUcl7wHwqb4z9iEu_ktczrlKHK8N4SZzs/edit?usp=sharing
> 
> In the doc I made an estimate of what the community considers core 2.0.0 
> items, based in part on old lists and a survey of the current state of JIRA. 
> The doc is open for comment. Please chime in if I am off or if I am missing 
> something that should be included. I also make a rough estimate of the state 
> of each core item.
> 
> I intend to keep up this macro-view doc as we progress on 2.0.0, with 
> reflection where pertinent in JIRA. I suggest we branch only when code 
> complete on the core set, most of which are complete or nearly so. 
> End-of-February should be time enough (first 2.0.0 RC at the start of May?).
> 
> Thanks,
> St.Ack
> 
>  
>> thanks
>> Stephen
>> 
>> On Sat, Dec 31, 2016 at 10:54 AM, Andrew Purtell 
>> wrote:
>> 
>> > Hi all,
>> >
>> > I've heard a rumor the co-RM situation with 2.0 may have changed. Can we
>> > get an update from co-RMs Matteo and Steven on their availability and
>> > interest in continuing in this role?
>> >
>> > To assist in moving 2.0 forward I intend to branch branch-2 from master
>> > next week. Unless there is an objection I will take this action under
>> > assumption of lazy consensus. Master branch will be renumbered to
>> > 3.0.0-SNAPSHOT. Once we have a branch-2 I will immediately begin scale
>> > tests and stabilization (via bug fixes or reverts of unfinished work) and
>> > invite interested collaborators to do the same.
>> >
>> >
>> >
> 


Re: Merge and HMerge

2017-01-13 Thread Stack
On Fri, Jan 13, 2017 at 7:16 PM, Stephen Jiang 
wrote:

> Revive this thread
>
> I am in the process of removing Region Server side merge (and split)
> transaction code in master branch; as now we have merge (and split)
> procedure(s) from master doing the same thing.
>
>
Good (Issue?)


> The Merge tool depends on the RS-side merge code.  I'd like to use this chance
> to remove the util.Merge tool.  This is for 2.0 and up releases only.
> Deprecation does not work here, as keeping the RS-side merge code would
> mean duplicate logic in the source code and make the new Assignment Manager
> code more complicated.
>
>
Could util.Merge be changed to ask the Master to run the merge (via AMv2)?

If you remove the util.Merge tool, how then does an operator ask for a
merge in its absence?
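
For reference, a minimal sketch of asking the Master for an online merge
through the client Admin API (the same call the shell's merge_region command
wraps); the encoded region names below are placeholders, and
Admin#mergeRegions is assumed to be available as in the 1.x client:

  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.client.Admin;
  import org.apache.hadoop.hbase.client.Connection;
  import org.apache.hadoop.hbase.client.ConnectionFactory;
  import org.apache.hadoop.hbase.util.Bytes;

  public class OnlineMergeSketch {
    public static void main(String[] args) throws Exception {
      try (Connection conn =
               ConnectionFactory.createConnection(HBaseConfiguration.create());
           Admin admin = conn.getAdmin()) {
        // Encoded region names (placeholders); the merge runs on the Master.
        byte[] regionA = Bytes.toBytes("8128fa75ae0cd4eba38da2667ac8ec98");
        byte[] regionB = Bytes.toBytes("b4bc69356d89018bfad3ee106b717285");
        admin.mergeRegions(regionA, regionB, false); // false: adjacent regions only
      }
    }
  }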

Thanks Stephen

S


> Please let me know whether you have objection.
>
> Thanks
> Stephen
>
> PS.  I could deprecate the HMerge code if anyone is really using it.  It has
> its own logic and is standalone (it is supposed to work offline, dangerously,
> and merge more than 2 regions; util.Merge and the shell do not support this
> functionality for now).
>
> On Wed, Nov 16, 2016 at 11:04 AM, Enis Söztutar 
> wrote:
>
> > @Appy what is not clear from above?
> >
> > I think we should get rid of both Merge and HMerge.
> >
> > We should not have any tool which works in offline mode by going over
> > the HDFS data. It seems very brittle, likely to break when things get
> > changed. The only use case I can think of is that you somehow end up with a
> > lot of regions and cannot bring the cluster back up because of OOMs, etc.,
> > and have to reduce the number of regions in offline mode. However, we have
> > not seen this kind of thing with any of our customers over the last couple
> > of years.
> >
> > I think we should seriously look into improving the normalizer and enabling
> > it by default for all tables. Ideally, the normalizer should run much more
> > frequently, and should be configured with higher-level goals and
> > heuristics, like on average how many regions per node, etc., and should
> > look at the global state (like the balancer) to decide on split/merge
> > points.
> >
> > Enis
> >
> > On Wed, Nov 16, 2016 at 1:17 AM, Apekshit Sharma 
> > wrote:
> >
> > > bq. HMerge can merge multiple regions by going over the list of
> > > regions and checking their sizes.
> > > bq. But both of these tools (Merge and HMerge) are very dangerous
> > >
> > > I came across HMerge and it looks like dead code. It isn't referenced from
> > > anywhere except one test. (This is what Lars also pointed out in the first
> > > email.)
> > > It would make perfect sense if it were a tool or were being referenced
> > > from somewhere, but lacking either of those, I am a bit confused here.
> > > @Enis, you seem to know everything about them, please educate me.
> > > Thanks
> > > - Appy
> > >
> > >
> > >
> > > On Thu, Sep 29, 2016 at 12:43 AM, Enis Söztutar 
> > > wrote:
> > >
> > > > Merge has very limited usability since it can do a single merge and can
> > > > only run when HBase is offline.
> > > > HMerge can merge multiple regions by going over the list of regions and
> > > > checking their sizes.
> > > > And of course we have the "supported" online merge, which is the shell
> > > > command.
> > > >
> > > > But both of these tools (Merge and HMerge) are very dangerous, I think.
> > > > I would say we should deprecate both, to be replaced by the online
> > > > merge tool. We should not allow offline merge at all. I fail to see the
> > > > use case where you have to use an offline merge.
> > > >
> > > > Enis
> > > >
> > > > On Wed, Sep 28, 2016 at 7:32 AM, Lars George 
> > > > wrote:
> > > >
> > > > > Hey,
> > > > >
> > > > > Sorry to resurrect this old thread, but working on the book update, I
> > > > > came across the same today, i.e. we have Merge and HMerge. I tried,
> > > > > and Merge works fine now. It is also the only one of the two flagged
> > > > > as being a tool. Should HMerge be removed? At least deprecated?
> > > > >
> > > > > Cheers,
> > > > > Lars
> > > > >
> > > > >
> > > > > On Thu, Jul 7, 2011 at 2:03 AM, Ted Yu 
> wrote:
> > > > > >>> there is already an issue to do this but not revamp of these
> > > > > >>> Merge classes
> > > > > > I guess the issue is HBASE-1621
> > > > > >
> > > > > > On Wed, Jul 6, 2011 at 2:28 PM, Stack  wrote:
> > > > > >
> > > > > >> Yeah, can you file an issue, Lars.  This stuff is ancient and
> > > > > >> needs to be redone AND redone so we can do merging while the table
> > > > > >> is online (there is already an issue to do this but not a revamp
> > > > > >> of these Merge classes).  The unit tests for Merge are also all
> > > > > >> junit3 and do whacky stuff to put up multiple regions.  This
> > > > > >> should be redone too 

Re: Moving 2.0 forward

2017-01-13 Thread Stack
On Sat, Dec 31, 2016 at 12:16 PM, Stephen Jiang 
wrote:

> Hello, Andrew. I was a helper to Matteo so that we could help each other
> while we were focusing on the new Assignment Manager work.  Now he is not
> available (at least for the next few months).  I have to be more focused on
> the new AM work, plus other work at my company; it would be too much for me
> to RM 2.0 alone.  I am happy for someone to take the primary 2.0 RM role
> while I still help to make this 2.0 release smooth.
>
>
(I could help out Stephen. We could co-RM?)


> For branch-2, I think it is too early to cut it, as we still have a lot of
> moving parts and on-going projects that need to be part of 2.0.  For
> example, the mentioned new AM (and other projects, such as HBASE-14414,
> HBASE-15179, HBASE-14070, HBASE-14850, HBASE-16833, HBASE-15531, just to
> name a few).  Cutting the branch now would add burden to completing those
> projects.
>
>
Agree with Stephen. A bunch of stuff is half-baked so a '2.0.0' now would
be all loose ends and it'd make for a messy narrative.

I started a doc listing state of 2.0.0:
https://docs.google.com/document/d/1WCsVlnHjJeKUcl7wHwqb4z9iEu_ktczrlKHK8N4SZzs/edit?usp=sharing

In the doc I made an estimate of what the community considers core 2.0.0
items, based in part on old lists and a survey of the current state of
JIRA. The doc is open for comment. Please chime in if I am off or if I am
missing something that should be included. I also make a rough estimate of
the state of each core item.

I intend to keep up this macro-view doc as we progress on 2.0.0, with
reflection where pertinent in JIRA. I suggest we branch only when code
complete on the core set, most of which are complete or nearly so.
End-of-February should be time enough (first 2.0.0 RC at the start of
May?).

Thanks,
St.Ack



> thanks
> Stephen
>
> On Sat, Dec 31, 2016 at 10:54 AM, Andrew Purtell  >
> wrote:
>
> > Hi all,
> >
> > I've heard a rumor the co-RM situation with 2.0 may have changed. Can we
> > get an update from co-RMs Matteo and Steven on their availability and
> > interest in continuing in this role?
> >
> > To assist in moving 2.0 forward I intend to branch branch-2 from master
> > next week. Unless there is an objection I will take this action under
> > assumption of lazy consensus. Master branch will be renumbered to
> > 3.0.0-SNAPSHOT. Once we have a branch-2 I will immediately begin scale
> > tests and stabilization (via bug fixes or reverts of unfinished work) and
> > invite interested collaborators to do the same.
> >
> >
> >
>


[jira] [Created] (HBASE-17469) Properly handle empty TableName in TablePermission#readFields and #write

2017-01-13 Thread Ted Yu (JIRA)
Ted Yu created HBASE-17469:
--

 Summary: Properly handle empty TableName in 
TablePermission#readFields and #write
 Key: HBASE-17469
 URL: https://issues.apache.org/jira/browse/HBASE-17469
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu


HBASE-17450 handles the empty table name in equals().

This JIRA is to properly handle an empty TableName in the 
TablePermission#readFields() and TablePermission#write() methods.
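
A minimal sketch of the guarded serialization, assuming a Writable-style pair 
of methods and a nullable table field (the names here are illustrative, not 
the actual TablePermission code): write a presence flag ahead of the table 
bytes so an empty/global permission round-trips without materializing an 
empty TableName.

{noformat}
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.util.Bytes;

// Hypothetical holder class; the real TablePermission differs.
class TableField {
  private TableName table; // null means "no table", e.g. a global permission

  void write(DataOutput out) throws IOException {
    boolean hasTable = table != null;
    out.writeBoolean(hasTable); // presence flag guards the table bytes
    if (hasTable) {
      Bytes.writeByteArray(out, table.getName());
    }
  }

  void readFields(DataInput in) throws IOException {
    table = in.readBoolean()
        ? TableName.valueOf(Bytes.readByteArray(in))
        : null; // never construct an empty TableName
  }
}
{noformat}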



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Merge and HMerge

2017-01-13 Thread Stephen Jiang
Revive this thread

I am in the process of removing Region Server side merge (and split)
transaction code in master branch; as now we have merge (and split)
procedure(s) from master doing the same thing.

The Merge tool depends on the RS-side merge code.  I'd like to use this chance
to remove the util.Merge tool.  This is for 2.0 and up releases only.
Deprecation does not work here, as keeping the RS-side merge code would
mean duplicate logic in the source code and make the new Assignment Manager
code more complicated.

Please let me know whether you have objection.

Thanks
Stephen

PS.  I could deprecate the HMerge code if anyone is really using it.  It has
its own logic and is standalone (it is supposed to work offline, dangerously,
and merge more than 2 regions; util.Merge and the shell do not support this
functionality for now).

On Wed, Nov 16, 2016 at 11:04 AM, Enis Söztutar  wrote:

> @Appy what is not clear from above?
>
> I think we should get rid of both Merge and HMerge.
>
> We should not have any tool which works in offline mode by going over
> the HDFS data. It seems very brittle, likely to break when things get
> changed. The only use case I can think of is that you somehow end up with a
> lot of regions and cannot bring the cluster back up because of OOMs, etc.,
> and have to reduce the number of regions in offline mode. However, we have
> not seen this kind of thing with any of our customers over the last couple
> of years.
>
> I think we should seriously look into improving the normalizer and enabling
> it by default for all tables. Ideally, the normalizer should run much more
> frequently, and should be configured with higher-level goals and heuristics,
> like on average how many regions per node, etc., and should look at the
> global state (like the balancer) to decide on split/merge points.
>
> Enis
>
> On Wed, Nov 16, 2016 at 1:17 AM, Apekshit Sharma 
> wrote:
>
> > bq. HMerge can merge multiple regions by going over the list of
> > regions and checking their sizes.
> > bq. But both of these tools (Merge and HMerge) are very dangerous
> >
> > I came across HMerge and it looks like dead code. It isn't referenced from
> > anywhere except one test. (This is what Lars also pointed out in the first
> > email.)
> > It would make perfect sense if it were a tool or were being referenced
> > from somewhere, but lacking either of those, I am a bit confused here.
> > @Enis, you seem to know everything about them, please educate me.
> > Thanks
> > - Appy
> >
> >
> >
> > On Thu, Sep 29, 2016 at 12:43 AM, Enis Söztutar 
> > wrote:
> >
> > > Merge has very limited usability since it can do a single merge and can
> > > only run when HBase is offline.
> > > HMerge can merge multiple regions by going over the list of regions and
> > > checking their sizes.
> > > And of course we have the "supported" online merge, which is the shell
> > > command.
> > >
> > > But both of these tools (Merge and HMerge) are very dangerous, I think.
> > > I would say we should deprecate both, to be replaced by the online
> > > merge tool. We should not allow offline merge at all. I fail to see the
> > > use case where you have to use an offline merge.
> > >
> > > Enis
> > >
> > > On Wed, Sep 28, 2016 at 7:32 AM, Lars George 
> > > wrote:
> > >
> > > > Hey,
> > > >
> > > > Sorry to resurrect this old thread, but working on the book update, I
> > > > came across the same today, i.e. we have Merge and HMerge. I tried
> and
> > > > Merge works fine now. It is also the only one of the two flagged as
> > > > being a tool. Should HMerge be removed? At least deprecated?
> > > >
> > > > Cheers,
> > > > Lars
> > > >
> > > >
> > > > On Thu, Jul 7, 2011 at 2:03 AM, Ted Yu  wrote:
> > > > >>> there is already an issue to do this but not revamp of these
> > > > >>> Merge classes
> > > > > I guess the issue is HBASE-1621
> > > > >
> > > > > On Wed, Jul 6, 2011 at 2:28 PM, Stack  wrote:
> > > > >
> > > > >> Yeah, can you file an issue, Lars.  This stuff is ancient and needs
> > > > >> to be redone AND redone so we can do merging while the table is
> > > > >> online (there is already an issue to do this but not a revamp of
> > > > >> these Merge classes).  The unit tests for Merge are also all junit3
> > > > >> and do whacky stuff to put up multiple regions.  This should be
> > > > >> redone too (they are often the first thing broken when there is a
> > > > >> major change, and putting them back together is a headache since
> > > > >> they do not follow the usual pattern).
> > > > >>
> > > > >> St.Ack
> > > > >>
> > > > >> On Sun, Jul 3, 2011 at 12:38 AM, Lars George <lars.geo...@gmail.com>
> > > > >> wrote:
> > > > >> > Hi Ted,
> > > > >> >
> > > > >> > The log is from an earlier attempt, I tried this a few times. This
> > > > >> > is all local, after rm'ing the /hbase. So the files are all pretty
> > > > >> > empty, but
> 

[jira] [Created] (HBASE-17468) unread messages in TCP connections - possible connection leak

2017-01-13 Thread Shridhar Sahukar (JIRA)
Shridhar Sahukar created HBASE-17468:


 Summary: unread messages in TCP connections - possible connection 
leak
 Key: HBASE-17468
 URL: https://issues.apache.org/jira/browse/HBASE-17468
 Project: HBase
  Issue Type: Bug
Reporter: Shridhar Sahukar
Priority: Critical


We are running HBase 1.2.0-cdh5.7.1 (Cloudera distribution).

On our Hadoop cluster, we are seeing that each HBase region server has a large 
number of TCP connections to all the HDFS data nodes, and all these connections 
have unread data in their socket buffers. Some of these connections are also in 
CLOSE_WAIT or FIN_WAIT1 state, while the rest are in ESTABLISHED state.

It looks like HBase is creating connections requesting data from HDFS, but 
forgetting about those connections before it reads the data. Thus the 
connections are left lingering with large amounts of data stuck in their 
receive buffers. Also, it seems HDFS closes these connections after a while, 
but since there is data in the receive buffer, the connections are left in 
CLOSE_WAIT/FIN_WAIT1 states.

Below is a snapshot from one of the region servers:

## Total number of connections to HDFS  (pid of region server is 143722)
[bda@md-bdadev-42 hbase]$ sudo netstat -anp|grep 143722 | wc -l
827

## Connections that are not in ESTABLISHED state
[bda@md-bdadev-42 hbase]$ sudo netstat -anp|grep 143722 | grep -v ESTABLISHED | 
wc -l
344

##Snapshot of some of these connections:
tcp    133887  0 146.1.180.43:48533  146.1.180.40:50010  ESTABLISHED 143722/java
tcp     82934  0 146.1.180.43:59647  146.1.180.42:50010  ESTABLISHED 143722/java
tcp         0  0 146.1.180.43:50761  146.1.180.27:2181   ESTABLISHED 143722/java
tcp    234084  0 146.1.180.43:58335  146.1.180.42:50010  ESTABLISHED 143722/java
tcp    967667  0 146.1.180.43:56136  146.1.180.68:50010  ESTABLISHED 143722/java
tcp    156037  0 146.1.180.43:59659  146.1.180.42:50010  ESTABLISHED 143722/java
tcp    212488  0 146.1.180.43:56810  146.1.180.48:50010  ESTABLISHED 143722/java
tcp     61871  0 146.1.180.43:53593  146.1.180.35:50010  ESTABLISHED 143722/java
tcp    121216  0 146.1.180.43:35324  146.1.180.38:50010  ESTABLISHED 143722/java
tcp         1  0 146.1.180.43:32982  146.1.180.42:50010  CLOSE_WAIT  143722/java
tcp     82934  0 146.1.180.43:42359  146.1.180.54:50010  ESTABLISHED 143722/java
tcp    159422  0 146.1.180.43:59731  146.1.180.42:50010  ESTABLISHED 143722/java
tcp    134573  0 146.1.180.43:60210  146.1.180.76:50010  ESTABLISHED 143722/java
tcp     82934  0 146.1.180.43:59713  146.1.180.42:50010  ESTABLISHED 143722/java
tcp    135765  0 146.1.180.43:44412  146.1.180.29:50010  ESTABLISHED 143722/java
tcp    161655  0 146.1.180.43:43117  146.1.180.42:50010  ESTABLISHED 143722/java
tcp     75990  0 146.1.180.43:59729  146.1.180.42:50010  ESTABLISHED 143722/java
tcp     78583  0 146.1.180.43:59971  146.1.180.42:50010  ESTABLISHED 143722/java
tcp         1  0 146.1.180.43:39893  146.1.180.67:50010  CLOSE_WAIT  143722/java
tcp         1  0 146.1.180.43:38834  146.1.180.47:50010  CLOSE_WAIT  143722/java
tcp         1  0 146.1.180.43:40707  146.1.180.50:50010  CLOSE_WAIT  143722/java
tcp    106102  0 146.1.180.43:48208  146.1.180.75:50010  ESTABLISHED 143722/java
tcp    332013  0 146.1.180.43:34795  146.1.180.37:50010  ESTABLISHED 143722/java
tcp         1  0 146.1.180.43:57644  146.1.180.67:50010  CLOSE_WAIT  143722/java
tcp     79119  0 146.1.180.43:54438  146.1.180.70:50010  ESTABLISHED 143722/java
tcp     77438  0 146.1.180.43:35259  146.1.180.38:50010  ESTABLISHED 143722/java
tcp         1  0 146.1.180.43:57579  146.1.180.41:50010  CLOSE_WAIT  143722/java
tcp    318091  0 146.1.180.43:60124  146.1.180.42:50010  ESTABLISHED 143722/java
tcp         1  0 146.1.180.43:51715  146.1.180.70:50010  CLOSE_WAIT  143722/java
tcp    126519  0 146.1.180.43:36389  146.1.180.49:50010  ESTABLISHED 143722/java
tcp         1  0 146.1.180.43:45656  146.1.180.75:50010  CLOSE_WAIT  143722/java
tcp    113720  0 146.1.180.43:59741  146.1.180.42:50010  ESTABLISHED 143722/java
tcp     74599  0 146.1.180.43:44192  146.1.180.60:50010  ESTABLISHED 143722/java
tcp    131224  0 146.1.180.43:53708  146.1.180.44:50010  ESTABLISHED 143722/java
tcp   1433915  

Re: Region compaction failed

2017-01-13 Thread Ted Yu
w.r.t. #2, I did a quick search for bloom-related fixes.

I found HBASE-13123, but it was in 1.0.2.

Planning to spend more time on this in the next few days.

On Fri, Jan 13, 2017 at 5:29 PM, Pankaj kr  wrote:

> Thanks Ted for replying.
>
> Actually the issue happened in a production environment and there are many
> HFiles in that store (we can't get the file). As we don't log the name of
> the corrupted file, is there any way to get it?
>
> Block encoding is "NONE", the table schema has a bloom filter of "ROW",
> compression type is "Snappy", and durability is SKIP_WAL.
>
>
> Regards,
> Pankaj
>
>
> -Original Message-
> From: Ted Yu [mailto:yuzhih...@gmail.com]
> Sent: Friday, January 13, 2017 10:30 PM
> To: dev@hbase.apache.org
> Cc: u...@hbase.apache.org
> Subject: Re: Region comapction failed
>
> In the second case, the error happened when writing the hfile. Can you track
> down the path of the new file so that further investigation can be done?
>
> Does the table use any encoding?
>
> Thanks
>
> > On Jan 13, 2017, at 2:47 AM, Pankaj kr  wrote:
> >
> > Hi,
> >
> > We met a weird issue in our production environment.
> >
> > Region compaction is always failing with the following errors:
> >
> > 1.
> > 2017-01-10 02:19:10,427 | ERROR | regionserver/RS-HOST/RS-IP:PORT-longCompactions-1483858654825 | Compaction failed Request = regionName=., storeName=XYZ, fileCount=6, fileSize=100.7 M (3.2 M, 20.8 M, 15.1 M, 20.9 M, 21.0 M, 19.7 M), priority=-5, time=1747414906352088 | org.apache.hadoop.hbase.regionserver.CompactSplitThread$CompactionRunner.doCompaction(CompactSplitThread.java:562)
> > java.io.IOException: ScanWildcardColumnTracker.checkColumn ran into a column actually smaller than the previous column:  XXX
> >    at org.apache.hadoop.hbase.regionserver.ScanWildcardColumnTracker.checkVersions(ScanWildcardColumnTracker.java:114)
> >    at org.apache.hadoop.hbase.regionserver.ScanQueryMatcher.match(ScanQueryMatcher.java:457)
> >    at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:551)
> >    at org.apache.hadoop.hbase.regionserver.compactions.Compactor.performCompaction(Compactor.java:328)
> >    at org.apache.hadoop.hbase.regionserver.compactions.DefaultCompactor.compact(DefaultCompactor.java:104)
> >    at org.apache.hadoop.hbase.regionserver.DefaultStoreEngine$DefaultCompactionContext.compact(DefaultStoreEngine.java:133)
> >    at org.apache.hadoop.hbase.regionserver.HStore.compact(HStore.java:1243)
> >    at org.apache.hadoop.hbase.regionserver.HRegion.compact(HRegion.java:1895)
> >    at org.apache.hadoop.hbase.regionserver.CompactSplitThread$CompactionRunner.doCompaction(CompactSplitThread.java:546)
> >    at org.apache.hadoop.hbase.regionserver.CompactSplitThread$CompactionRunner.run(CompactSplitThread.java:583)
> >    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> >    at java.util.concurrent.ThreadPoolExecuto
> >
> > 2.
> > 2017-01-10 02:33:53,009 | ERROR | regionserver/RS-HOST/RS-IP:PORT-longCompactions-1483686810953 | Compaction failed Request = regionName=YY, storeName=ABC, fileCount=6, fileSize=125.3 M (20.9 M, 20.9 M, 20.9 M, 20.9 M, 20.9 M, 20.9 M), priority=-68, time=1748294500157323 | org.apache.hadoop.hbase.regionserver.CompactSplitThread$CompactionRunner.doCompaction(CompactSplitThread.java:562)
> > java.io.IOException: Non-increasing Bloom keys: XX after 
> >    at org.apache.hadoop.hbase.regionserver.StoreFile$Writer.appendGeneralBloomfilter(StoreFile.java:911)
> >    at org.apache.hadoop.hbase.regionserver.StoreFile$Writer.append(StoreFile.java:947)
> >    at org.apache.hadoop.hbase.regionserver.compactions.Compactor.performCompaction(Compactor.java:337)
> >    at org.apache.hadoop.hbase.regionserver.compactions.DefaultCompactor.compact(DefaultCompactor.java:104)
> >    at org.apache.hadoop.hbase.regionserver.DefaultStoreEngine$DefaultCompactionContext.compact(DefaultStoreEngine.java:133)
> >    at org.apache.hadoop.hbase.regionserver.HStore.compact(HStore.java:1243)
> >    at org.apache.hadoop.hbase.regionserver.HRegion.compact(HRegion.java:1895)
> >    at org.apache.hadoop.hbase.regionserver.CompactSplitThread$CompactionRunner.doCompaction(CompactSplitThread.java:546)
> >    at org.apache.hadoop.hbase.regionserver.CompactSplitThread$CompactionRunner.run(CompactSplitThread.java:583)
> >    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> >    at java.util.concurrent.ThreadPoolExecutor$Worker.run(
> 

RE: Region compaction failed

2017-01-13 Thread Pankaj kr
Thanks Ted for replying.

Actually the issue happened in a production environment and there are many 
HFiles in that store (we can't get the file). As we don't log the name of the 
corrupted file, is there any way to get it?

Block encoding is "NONE", the table schema has a bloom filter of "ROW", 
compression type is "Snappy", and durability is SKIP_WAL.


Regards,
Pankaj


-Original Message-
From: Ted Yu [mailto:yuzhih...@gmail.com] 
Sent: Friday, January 13, 2017 10:30 PM
To: dev@hbase.apache.org
Cc: u...@hbase.apache.org
Subject: Re: Region comapction failed

In the second case, the error happened when writing the hfile. Can you track 
down the path of the new file so that further investigation can be done?

Does the table use any encoding?

Thanks

> On Jan 13, 2017, at 2:47 AM, Pankaj kr  wrote:
> 
> Hi,
> 
> We met a weird issue in our production environment.
> 
> Region compaction is always failing with the following errors:
> 
> 1.
> 2017-01-10 02:19:10,427 | ERROR | 
> regionserver/RS-HOST/RS-IP:PORT-longCompactions-1483858654825 | Compaction 
> failed Request = regionName=., storeName=XYZ, fileCount=6, fileSize=100.7 
> M (3.2 M, 20.8 M, 15.1 M, 20.9 M, 21.0 M, 19.7 M), priority=-5, 
> time=1747414906352088 | 
> org.apache.hadoop.hbase.regionserver.CompactSplitThread$CompactionRunner.doCompaction(CompactSplitThread.java:562)
> java.io.IOException: ScanWildcardColumnTracker.checkColumn ran into a column 
> actually smaller than the previous column:  XXX
>at 
> org.apache.hadoop.hbase.regionserver.ScanWildcardColumnTracker.checkVersions(ScanWildcardColumnTracker.java:114)
>at 
> org.apache.hadoop.hbase.regionserver.ScanQueryMatcher.match(ScanQueryMatcher.java:457)
>at 
> org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:551)
>at 
> org.apache.hadoop.hbase.regionserver.compactions.Compactor.performCompaction(Compactor.java:328)
>at 
> org.apache.hadoop.hbase.regionserver.compactions.DefaultCompactor.compact(DefaultCompactor.java:104)
>at 
> org.apache.hadoop.hbase.regionserver.DefaultStoreEngine$DefaultCompactionContext.compact(DefaultStoreEngine.java:133)
>at 
> org.apache.hadoop.hbase.regionserver.HStore.compact(HStore.java:1243)
>at 
> org.apache.hadoop.hbase.regionserver.HRegion.compact(HRegion.java:1895)
>at 
> org.apache.hadoop.hbase.regionserver.CompactSplitThread$CompactionRunner.doCompaction(CompactSplitThread.java:546)
>at 
> org.apache.hadoop.hbase.regionserver.CompactSplitThread$CompactionRunner.run(CompactSplitThread.java:583)
>at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>at java.util.concurrent.ThreadPoolExecuto
> 
> 2.
> 2017-01-10 02:33:53,009 | ERROR | 
> regionserver/RS-HOST/RS-IP:PORT-longCompactions-1483686810953 | Compaction 
> failed Request = regionName=YY, storeName=ABC, fileCount=6, 
> fileSize=125.3 M (20.9 M, 20.9 M, 20.9 M, 20.9 M, 20.9 M, 20.9 M), 
> priority=-68, time=1748294500157323 | 
> org.apache.hadoop.hbase.regionserver.CompactSplitThread$CompactionRunner.doCompaction(CompactSplitThread.java:562)
> java.io.IOException: Non-increasing Bloom keys: XX after 
> 
>at 
> org.apache.hadoop.hbase.regionserver.StoreFile$Writer.appendGeneralBloomfilter(StoreFile.java:911)
>at 
> org.apache.hadoop.hbase.regionserver.StoreFile$Writer.append(StoreFile.java:947)
>at 
> org.apache.hadoop.hbase.regionserver.compactions.Compactor.performCompaction(Compactor.java:337)
>at 
> org.apache.hadoop.hbase.regionserver.compactions.DefaultCompactor.compact(DefaultCompactor.java:104)
>at 
> org.apache.hadoop.hbase.regionserver.DefaultStoreEngine$DefaultCompactionContext.compact(DefaultStoreEngine.java:133)
>at 
> org.apache.hadoop.hbase.regionserver.HStore.compact(HStore.java:1243)
>at 
> org.apache.hadoop.hbase.regionserver.HRegion.compact(HRegion.java:1895)
>at 
> org.apache.hadoop.hbase.regionserver.CompactSplitThread$CompactionRunner.doCompaction(CompactSplitThread.java:546)
>at 
> org.apache.hadoop.hbase.regionserver.CompactSplitThread$CompactionRunner.run(CompactSplitThread.java:583)
>at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>at java.lang.Thread.run(Thread.java:745)
> 
> HBase version : 1.0.2
> 
> We have verified all the HFiles in the store using HFilePrettyPrinter with
> "k" (checkrow); all reports are normal. A full scan is also successful.
> We don't have access to the actual data, and the customer may not agree to
> share that.

[jira] [Created] (HBASE-17467) HBase Examples: C# DemoClient

2017-01-13 Thread Jeff Saremi (JIRA)
Jeff Saremi created HBASE-17467:
---

 Summary: HBase Examples: C# DemoClient
 Key: HBASE-17467
 URL: https://issues.apache.org/jira/browse/HBASE-17467
 Project: HBase
  Issue Type: Task
  Components: Client
Affects Versions: 1.1.8
Reporter: Jeff Saremi


I am attaching DemoClient.cs, which is ported from the C++ version of the same 
file, along with the generated HBase Thrift files (0.9.3). Hoping that someone 
will find them useful.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-17466) [C++] Speed up the tests a bit

2017-01-13 Thread Enis Soztutar (JIRA)
Enis Soztutar created HBASE-17466:
-

 Summary: [C++] Speed up the tests a bit
 Key: HBASE-17466
 URL: https://issues.apache.org/jira/browse/HBASE-17466
 Project: HBase
  Issue Type: Sub-task
Reporter: Enis Soztutar


The tests take too long due to sleeps and starting/stopping the cluster. We 
can speed them up a bit. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-17465) [C++] implement request retry mechanism over RPC

2017-01-13 Thread Xiaobing Zhou (JIRA)
Xiaobing Zhou created HBASE-17465:
-

 Summary: [C++] implement request retry mechanism over RPC
 Key: HBASE-17465
 URL: https://issues.apache.org/jira/browse/HBASE-17465
 Project: HBase
  Issue Type: Sub-task
Reporter: Xiaobing Zhou
Assignee: Xiaobing Zhou






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-17464) Fix HBaseTestingUtility.getNewDataTestDirOnTestFS to always return a unique path

2017-01-13 Thread Zach York (JIRA)
Zach York created HBASE-17464:
-

 Summary: Fix HBaseTestingUtility.getNewDataTestDirOnTestFS to 
always return a unique path
 Key: HBASE-17464
 URL: https://issues.apache.org/jira/browse/HBASE-17464
 Project: HBase
  Issue Type: Bug
  Components: test
Affects Versions: 2.0.0
Reporter: Zach York
Assignee: Zach York
Priority: Minor


Currently, HBaseTestingUtility.getNewDataTestDirOnTestFS() returns a unique 
path only on non-local filesystems. This method should always return a unique 
directory. This bug fix is needed to accurately test HBASE-17437.
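
A minimal sketch of the intended behavior, assuming a UUID suffix is 
acceptable (the names are illustrative, not the actual HBaseTestingUtility 
code):

{noformat}
import java.util.UUID;
import org.apache.hadoop.fs.Path;

final class UniqueTestDirSketch {
  // Append a random UUID so every call yields a distinct directory,
  // regardless of whether the test filesystem is local or distributed.
  static Path newDataTestDir(Path base) {
    return new Path(base, UUID.randomUUID().toString());
  }
}
{noformat}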



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-17463) [C++] RpcClient should close the thread pool

2017-01-13 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar resolved HBASE-17463.
---
Resolution: Fixed

> [C++] RpcClient should close the thread pool
> 
>
> Key: HBASE-17463
> URL: https://issues.apache.org/jira/browse/HBASE-17463
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
> Fix For: HBASE-14850
>
> Attachments: hbase-17463_v1.patch
>
>
> RpcClient and connection pool should close their resources. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-17463) [C++] RpcClient should close the thread pools

2017-01-13 Thread Enis Soztutar (JIRA)
Enis Soztutar created HBASE-17463:
-

 Summary: [C++] RpcClient should close the thread pools
 Key: HBASE-17463
 URL: https://issues.apache.org/jira/browse/HBASE-17463
 Project: HBase
  Issue Type: Sub-task
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Fix For: HBASE-14850


RpcClient and connection pool should close their resources. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-17462) Investigate using sliding window for read/write request costs in StochasticLoadBalancer

2017-01-13 Thread Ted Yu (JIRA)
Ted Yu created HBASE-17462:
--

 Summary: Investigate using sliding window for read/write request 
costs in StochasticLoadBalancer
 Key: HBASE-17462
 URL: https://issues.apache.org/jira/browse/HBASE-17462
 Project: HBase
  Issue Type: Improvement
Reporter: Ted Yu


In the thread http://search-hadoop.com/m/HBase/YGbbyUZKXWALkX1, Timothy was 
asking whether the read/write request costs in StochasticLoadBalancer should be 
calculated as rates.

This makes sense, since read/write load on a region server tends to fluctuate 
over time. Using a sliding window would better reflect the recent trend in 
read/write load.

Some factors to consider:

The data structure used by StochasticLoadBalancer should be concise. The
number of regions in a cluster can be expected to approach 1 million. We
cannot afford to store a long history of read/write requests in the master.

Efficiency of cost calculation should be high: there are many cost
functions the balancer goes through, and each cost function is expected
to return quickly. Otherwise we would not come up with proper region
movement plan(s) in time.
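
As a starting point, a minimal sketch of a fixed-size per-region window that 
turns cumulative request counters into a recent rate; the class and field 
names are illustrative, not StochasticLoadBalancer code. With ~1M regions the 
window must stay small, since memory cost is O(window) per region:

{noformat}
// Hypothetical sketch: derive a requests-per-second rate from the
// cumulative read/write counters reported in region load.
class RequestRateWindow {
  private final long[] counts;  // cumulative request counters
  private final long[] times;   // sample timestamps, ms
  private int next = 0;
  private int filled = 0;

  RequestRateWindow(int window) {
    counts = new long[window];
    times = new long[window];
  }

  /** Record the latest cumulative counter from a load report. */
  void record(long cumulative, long nowMs) {
    counts[next] = cumulative;
    times[next] = nowMs;
    next = (next + 1) % counts.length;
    if (filled < counts.length) filled++;
  }

  /** Requests/second over the window; 0 until two samples exist. */
  double ratePerSecond() {
    if (filled < 2) return 0.0;
    int newest = (next - 1 + counts.length) % counts.length;
    int oldest = (filled == counts.length) ? next : 0;
    long dt = times[newest] - times[oldest];
    return dt <= 0 ? 0.0 : (counts[newest] - counts[oldest]) * 1000.0 / dt;
  }
}
{noformat}

Reading the rate is O(1), which keeps per-cost-function evaluation cheap.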




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Successful: HBase Generate Website

2017-01-13 Thread Apache Jenkins Server
Build status: Successful

If successful, the website and docs have been generated. To update the live 
site, follow the instructions below. If failed, skip to the bottom of this 
email.

Use the following commands to download the patch and apply it to a clean branch 
based on origin/asf-site. If you prefer to keep the hbase-site repo around 
permanently, you can skip the clone step.

  git clone https://git-wip-us.apache.org/repos/asf/hbase-site.git

  cd hbase-site
  wget -O- https://builds.apache.org/job/hbase_generate_website/460/artifact/website.patch.zip | funzip > 2f8ddf6fc5f904f0273b07469286e01aa02c7da5.patch
  git fetch
  git checkout -b asf-site-2f8ddf6fc5f904f0273b07469286e01aa02c7da5 origin/asf-site
  git am --whitespace=fix 2f8ddf6fc5f904f0273b07469286e01aa02c7da5.patch

At this point, you can preview the changes by opening index.html or any of the 
other HTML pages in your local 
asf-site-2f8ddf6fc5f904f0273b07469286e01aa02c7da5 branch.

There are lots of spurious changes, such as timestamps and CSS styles in 
tables, so a generic git diff is not very useful. To see a list of files that 
have been added, deleted, renamed, changed type, or are otherwise interesting, 
use the following command:

  git diff --name-status --diff-filter=ADCRTXUB origin/asf-site

To see only files that had 100 or more lines changed:

  git diff --stat origin/asf-site | grep -E '[1-9][0-9]{2,}'

When you are satisfied, publish your changes to origin/asf-site using these 
commands:

  git commit --allow-empty -m "Empty commit" # to work around a current ASF INFRA bug
  git push origin asf-site-2f8ddf6fc5f904f0273b07469286e01aa02c7da5:asf-site
  git checkout asf-site
  git branch -D asf-site-2f8ddf6fc5f904f0273b07469286e01aa02c7da5

Changes take a couple of minutes to be propagated. You can verify whether they 
have been propagated by looking at the Last Published date at the bottom of 
http://hbase.apache.org/. It should match the date in the index.html on the 
asf-site branch in Git.

As a courtesy, reply-all to this email to let other committers know you pushed 
the site.



If failed, see https://builds.apache.org/job/hbase_generate_website/460/console

Re: Region compaction failed

2017-01-13 Thread Ted Yu
In the second case, the error happened when writing the hfile. Can you track 
down the path of the new file so that further investigation can be done?

Does the table use any encoding?

Thanks

> On Jan 13, 2017, at 2:47 AM, Pankaj kr  wrote:
> 
> Hi,
> 
> We met a weird issue in our production environment.
> 
> Region compaction is always failing with the following errors:
> 
> 1.
> 2017-01-10 02:19:10,427 | ERROR | 
> regionserver/RS-HOST/RS-IP:PORT-longCompactions-1483858654825 | Compaction 
> failed Request = regionName=., storeName=XYZ, fileCount=6, fileSize=100.7 
> M (3.2 M, 20.8 M, 15.1 M, 20.9 M, 21.0 M, 19.7 M), priority=-5, 
> time=1747414906352088 | 
> org.apache.hadoop.hbase.regionserver.CompactSplitThread$CompactionRunner.doCompaction(CompactSplitThread.java:562)
> java.io.IOException: ScanWildcardColumnTracker.checkColumn ran into a column 
> actually smaller than the previous column:  XXX
>at 
> org.apache.hadoop.hbase.regionserver.ScanWildcardColumnTracker.checkVersions(ScanWildcardColumnTracker.java:114)
>at 
> org.apache.hadoop.hbase.regionserver.ScanQueryMatcher.match(ScanQueryMatcher.java:457)
>at 
> org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:551)
>at 
> org.apache.hadoop.hbase.regionserver.compactions.Compactor.performCompaction(Compactor.java:328)
>at 
> org.apache.hadoop.hbase.regionserver.compactions.DefaultCompactor.compact(DefaultCompactor.java:104)
>at 
> org.apache.hadoop.hbase.regionserver.DefaultStoreEngine$DefaultCompactionContext.compact(DefaultStoreEngine.java:133)
>at 
> org.apache.hadoop.hbase.regionserver.HStore.compact(HStore.java:1243)
>at 
> org.apache.hadoop.hbase.regionserver.HRegion.compact(HRegion.java:1895)
>at 
> org.apache.hadoop.hbase.regionserver.CompactSplitThread$CompactionRunner.doCompaction(CompactSplitThread.java:546)
>at 
> org.apache.hadoop.hbase.regionserver.CompactSplitThread$CompactionRunner.run(CompactSplitThread.java:583)
>at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>at java.util.concurrent.ThreadPoolExecuto
> 
> 2.
> 2017-01-10 02:33:53,009 | ERROR | 
> regionserver/RS-HOST/RS-IP:PORT-longCompactions-1483686810953 | Compaction 
> failed Request = regionName=YY, storeName=ABC, fileCount=6, 
> fileSize=125.3 M (20.9 M, 20.9 M, 20.9 M, 20.9 M, 20.9 M, 20.9 M), 
> priority=-68, time=1748294500157323 | 
> org.apache.hadoop.hbase.regionserver.CompactSplitThread$CompactionRunner.doCompaction(CompactSplitThread.java:562)
> java.io.IOException: Non-increasing Bloom keys: XX after 
> 
>at 
> org.apache.hadoop.hbase.regionserver.StoreFile$Writer.appendGeneralBloomfilter(StoreFile.java:911)
>at 
> org.apache.hadoop.hbase.regionserver.StoreFile$Writer.append(StoreFile.java:947)
>at 
> org.apache.hadoop.hbase.regionserver.compactions.Compactor.performCompaction(Compactor.java:337)
>at 
> org.apache.hadoop.hbase.regionserver.compactions.DefaultCompactor.compact(DefaultCompactor.java:104)
>at 
> org.apache.hadoop.hbase.regionserver.DefaultStoreEngine$DefaultCompactionContext.compact(DefaultStoreEngine.java:133)
>at 
> org.apache.hadoop.hbase.regionserver.HStore.compact(HStore.java:1243)
>at 
> org.apache.hadoop.hbase.regionserver.HRegion.compact(HRegion.java:1895)
>at 
> org.apache.hadoop.hbase.regionserver.CompactSplitThread$CompactionRunner.doCompaction(CompactSplitThread.java:546)
>at 
> org.apache.hadoop.hbase.regionserver.CompactSplitThread$CompactionRunner.run(CompactSplitThread.java:583)
>at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>at java.lang.Thread.run(Thread.java:745)
> 
> HBase version : 1.0.2
> 
> We have verified all the HFiles in the store using HFilePrettyPrinter with
> "k" (checkrow); all reports are normal. A full scan is also successful.
> We don't have access to the actual data, and the customer may not agree to
> share that.
> 
> Has anyone faced this issue? Any pointers would be much appreciated.
> 
> Thanks & Regards,
> Pankaj


[jira] [Created] (HBASE-17461) HBase shell *major_compact* command should properly convert *table_or_region_name* parameter to a java byte array before calling *HBaseAdmin.majorCompact*

2017-01-13 Thread Wellington Chevreuil (JIRA)
Wellington Chevreuil created HBASE-17461:


 Summary: HBase shell *major_compact* command should properly 
convert the *table_or_region_name* parameter to a java byte array before 
calling the *HBaseAdmin.majorCompact* method
 Key: HBASE-17461
 URL: https://issues.apache.org/jira/browse/HBASE-17461
 Project: HBase
  Issue Type: Bug
  Components: shell
Reporter: Wellington Chevreuil


On the HBase shell, the *major_compact* command simply passes the received 
*table_or_region_name* parameter straight to the Java *HBaseAdmin.majorCompact* 
method.

In some corner cases, HBase table row keys may contain special characters. 
Then, if a region is split in such a way that row keys with special characters 
become part of the region name, calling *major_compact* on these regions will 
fail whenever a special character's ASCII code is higher than 127. This happens 
because the Java byte type is signed, while the Ruby byte type isn't, causing 
the region name to be converted to a wrong string on the Java side.

For example, considering a region named as below:

{noformat}
test,\xF8\xB9B2!$\x9C\x0A\xFEG\xC0\xE3\x8B\x1B\xFF\x15,1481745228583.b4bc69356d89018bfad3ee106b717285.
{noformat} 

Calling major_compact on it fails as follows:

{noformat}
hbase(main):008:0* major_compact 
"test,\xF8\xB9B2!$\x9C\x0A\xFEG\xC0\xE3\x8B\x1B\xFF\x15,1484177359169.8128fa75ae0cd4eba38da2667ac8ec98."

ERROR: Illegal character code:44, <,> at 4. User-space table qualifiers can 
only contain 'alphanumeric characters': i.e. [a-zA-Z_0-9-.]: test,�B2!$�
�G���1484177359169.8128fa75ae0cd4eba38da2667ac8ec98.
{noformat}

An easy solution is to convert the *table_or_region_name* parameter properly, 
prior to calling *HBaseAdmin.majorCompact*, in the same way as is already done 
for some other shell commands, such as *get*:

{noformat}
admin.major_compact(table_or_region_name.to_s.to_java_bytes, family)
{noformat}
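
To make the signed/unsigned mismatch concrete, a small illustrative Java 
snippet (not part of the patch):

{noformat}
public class SignedByteDemo {
  public static void main(String[] args) {
    byte b = (byte) 0xF8;          // first byte of the example region name
    System.out.println(b);         // prints -8: Java bytes are signed
    System.out.println(b & 0xFF);  // prints 248: the unsigned value Ruby handles
  }
}
{noformat}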



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Region compaction failed

2017-01-13 Thread Pankaj kr
Hi,

We met a weird issue in our production environment.

Region compaction is always failing with the following errors:

1.
2017-01-10 02:19:10,427 | ERROR | 
regionserver/RS-HOST/RS-IP:PORT-longCompactions-1483858654825 | Compaction 
failed Request = regionName=., storeName=XYZ, fileCount=6, fileSize=100.7 M 
(3.2 M, 20.8 M, 15.1 M, 20.9 M, 21.0 M, 19.7 M), priority=-5, 
time=1747414906352088 | 
org.apache.hadoop.hbase.regionserver.CompactSplitThread$CompactionRunner.doCompaction(CompactSplitThread.java:562)
java.io.IOException: ScanWildcardColumnTracker.checkColumn ran into a column 
actually smaller than the previous column:  XXX
at 
org.apache.hadoop.hbase.regionserver.ScanWildcardColumnTracker.checkVersions(ScanWildcardColumnTracker.java:114)
at 
org.apache.hadoop.hbase.regionserver.ScanQueryMatcher.match(ScanQueryMatcher.java:457)
at 
org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:551)
at 
org.apache.hadoop.hbase.regionserver.compactions.Compactor.performCompaction(Compactor.java:328)
at 
org.apache.hadoop.hbase.regionserver.compactions.DefaultCompactor.compact(DefaultCompactor.java:104)
at 
org.apache.hadoop.hbase.regionserver.DefaultStoreEngine$DefaultCompactionContext.compact(DefaultStoreEngine.java:133)
at 
org.apache.hadoop.hbase.regionserver.HStore.compact(HStore.java:1243)
at 
org.apache.hadoop.hbase.regionserver.HRegion.compact(HRegion.java:1895)
at 
org.apache.hadoop.hbase.regionserver.CompactSplitThread$CompactionRunner.doCompaction(CompactSplitThread.java:546)
at 
org.apache.hadoop.hbase.regionserver.CompactSplitThread$CompactionRunner.run(CompactSplitThread.java:583)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecuto

2.
2017-01-10 02:33:53,009 | ERROR | 
regionserver/RS-HOST/RS-IP:PORT-longCompactions-1483686810953 | Compaction 
failed Request = regionName=YY, storeName=ABC, fileCount=6, fileSize=125.3 
M (20.9 M, 20.9 M, 20.9 M, 20.9 M, 20.9 M, 20.9 M), priority=-68, 
time=1748294500157323 | 
org.apache.hadoop.hbase.regionserver.CompactSplitThread$CompactionRunner.doCompaction(CompactSplitThread.java:562)
java.io.IOException: Non-increasing Bloom keys: XX after 

at 
org.apache.hadoop.hbase.regionserver.StoreFile$Writer.appendGeneralBloomfilter(StoreFile.java:911)
at 
org.apache.hadoop.hbase.regionserver.StoreFile$Writer.append(StoreFile.java:947)
at 
org.apache.hadoop.hbase.regionserver.compactions.Compactor.performCompaction(Compactor.java:337)
at 
org.apache.hadoop.hbase.regionserver.compactions.DefaultCompactor.compact(DefaultCompactor.java:104)
at 
org.apache.hadoop.hbase.regionserver.DefaultStoreEngine$DefaultCompactionContext.compact(DefaultStoreEngine.java:133)
at 
org.apache.hadoop.hbase.regionserver.HStore.compact(HStore.java:1243)
at 
org.apache.hadoop.hbase.regionserver.HRegion.compact(HRegion.java:1895)
at 
org.apache.hadoop.hbase.regionserver.CompactSplitThread$CompactionRunner.doCompaction(CompactSplitThread.java:546)
at 
org.apache.hadoop.hbase.regionserver.CompactSplitThread$CompactionRunner.run(CompactSplitThread.java:583)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

HBase version : 1.0.2

We have verified all the HFiles in the store using HFilePrettyPrinter with "k" 
(checkrow); all reports are normal. A full scan is also successful.
We don't have access to the actual data, and the customer may not agree to 
share that.
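
For reference, the check mentioned above can be run like this (the path below 
is illustrative; -k is checkrow and -f selects the HFile):

  hbase org.apache.hadoop.hbase.io.hfile.HFile -k -f \
    hdfs:///hbase/data/default/<table>/<region>/<family>/<hfile>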

Has anyone faced this issue? Any pointers would be much appreciated.

Thanks & Regards,
Pankaj


[jira] [Created] (HBASE-17460) enable_table_replication can not perform cyclic replication of a table

2017-01-13 Thread NITIN VERMA (JIRA)
NITIN VERMA created HBASE-17460:
---

 Summary: enable_table_replication can not perform cyclic 
replication of a table
 Key: HBASE-17460
 URL: https://issues.apache.org/jira/browse/HBASE-17460
 Project: HBase
  Issue Type: Bug
  Components: Replication
Reporter: NITIN VERMA


The enable_table_replication operation is broken for cyclic replication of an 
HBase table, as we compare all the properties of the column families (including 
REPLICATION_SCOPE). 
Below is exactly what happens:
1. Running the "enable_table_replication 'table1'" operation on the first 
cluster will set the REPLICATION_SCOPE of all column families to '1'. This will 
also create the table on the second cluster, where REPLICATION_SCOPE is still 
set to '0'.

2. Now when we run "enable_table_replication 'table1'" on the second cluster, 
we compare all the properties of the table (including REPLICATION_SCOPE), which 
obviously is different now. 

I am proposing a fix for this issue where we avoid comparing REPLICATION_SCOPE 
inside the HColumnDescriptor::compareTo() method, especially when replication 
is not already enabled on the desired table.

I have made that change and it is working. I will submit the patch soon.
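
A minimal sketch of the proposed comparison, assuming we normalize the scope 
on copies before comparing (the helper name is illustrative; the actual patch 
may special-case compareTo() instead):

{noformat}
import org.apache.hadoop.hbase.HColumnDescriptor;

final class ReplicationSafeCompare {
  // Compare two column families while ignoring REPLICATION_SCOPE:
  // clone both descriptors, force the scopes to the same value, then
  // fall back to the regular comparison for everything else.
  static boolean equalIgnoringScope(HColumnDescriptor a, HColumnDescriptor b) {
    HColumnDescriptor ca = new HColumnDescriptor(a);
    HColumnDescriptor cb = new HColumnDescriptor(b);
    ca.setScope(0);
    cb.setScope(0);
    return ca.compareTo(cb) == 0;
  }
}
{noformat}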



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)