[jira] [Commented] (HBASE-18128) compaction marker could be skipped
[ https://issues.apache.org/jira/browse/HBASE-18128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16036561#comment-16036561 ] Anoop Sam John commented on HBASE-18128: If for the compaction marker alone we don't consider the seqId, that is ok as of now, as your comment said.. Pls add comments explaining why.. bq. Is it possible that this kind of a meta cell getting clubbed together with normal cell(s)? I think NO. Can you pls check and confirm once? > compaction marker could be skipped > --- > > Key: HBASE-18128 > URL: https://issues.apache.org/jira/browse/HBASE-18128 > Project: HBase > Issue Type: Improvement > Components: Compaction, regionserver >Reporter: Jingyun Tian >Assignee: Jingyun Tian > Attachments: HBASE-18128.patch > > > The sequence for a compaction is as follows: > 1. Compaction writes new files under the region/.tmp directory (compaction output) > 2. Compaction atomically moves the temporary file under the region directory > 3. Compaction appends a WAL edit containing the compaction input and output > files. Forces sync on WAL. > 4. Compaction deletes the input files from the region directory. > But if a flush happened between 3 and 4 and then the regionserver crashed, the > compaction marker will be skipped when splitting the log, because the sequence id > of the compaction marker is smaller than lastFlushedSequenceId. > {code} > if (lastFlushedSequenceId >= entry.getKey().getLogSeqNum()) { > editsSkipped++; > continue; > } > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
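The skip logic under discussion can be sketched as follows. This is a minimal, self-contained Java model of the filter described above, not HBase's actual log-split code: the class, the `Entry` fields, and the `replay` helper are all illustrative. The point is that the lastFlushedSequenceId check must not apply to compaction markers.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: when replaying WAL entries during log split, apply the
// lastFlushedSequenceId filter only to ordinary edits, and let compaction
// markers through regardless of their sequence id.
public class SkipFilterSketch {
  static final class Entry {
    final long seqId;
    final boolean isCompactionMarker;
    Entry(long seqId, boolean isCompactionMarker) {
      this.seqId = seqId;
      this.isCompactionMarker = isCompactionMarker;
    }
  }

  static List<Entry> replay(List<Entry> entries, long lastFlushedSequenceId) {
    List<Entry> applied = new ArrayList<>();
    for (Entry e : entries) {
      // Ordinary edits at or below the last flushed sequence id are already
      // durable in store files, so they can safely be skipped...
      if (!e.isCompactionMarker && lastFlushedSequenceId >= e.seqId) {
        continue;
      }
      // ...but a compaction marker must still be replayed so the server
      // deletes the compaction input files before opening the region.
      applied.add(e);
    }
    return applied;
  }

  public static void main(String[] args) {
    List<Entry> entries = new ArrayList<>();
    entries.add(new Entry(5, false));  // flushed edit: skipped
    entries.add(new Entry(7, true));   // compaction marker below flush id: kept
    entries.add(new Entry(12, false)); // unflushed edit: kept
    List<Entry> applied = replay(entries, 10);
    System.out.println(applied.size());                    // 2
    System.out.println(applied.get(0).isCompactionMarker); // true
  }
}
```

With the original check, the marker at seqId 7 would be dropped because 10 >= 7, which is exactly the bug scenario (flush between steps 3 and 4, then a crash).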
[jira] [Commented] (HBASE-18158) Two running in-memory compaction threads may lose data
[ https://issues.apache.org/jira/browse/HBASE-18158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16036551#comment-16036551 ] Anoop Sam John commented on HBASE-18158: inMemoryFlushInProgress is reset in flushInMemory finally anyway.. Ya, makes sense.. Excellent find! > Two running in-memory compaction threads may lose data > -- > > Key: HBASE-18158 > URL: https://issues.apache.org/jira/browse/HBASE-18158 > Project: HBase > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Chia-Ping Tsai >Assignee: Chia-Ping Tsai > Fix For: 2.0.0 > > Attachments: HBASE-18158.v0.patch > > > {code:title=CompactingMemStore.java|borderStyle=solid} > private void stopCompaction() { > if (inMemoryFlushInProgress.get()) { > compactor.stop(); > inMemoryFlushInProgress.set(false); > } > } > {code} > The stopCompaction() sets inMemoryFlushInProgress to false, so there may be two > in-memory compaction threads executing simultaneously. If there are two > running InMemoryFlushRunnables, the later InMemoryFlushRunnable may change the > versionedList. > {code:title=MemStoreCompactor.java|borderStyle=solid} > public boolean start() throws IOException { > if (!compactingMemStore.hasImmutableSegments()) { // no compaction on > empty pipeline > return false; > } > // get a snapshot of the list of the segments from the pipeline, > // this local copy of the list is marked with specific version > versionedList = compactingMemStore.getImmutableSegments(); > {code} > And the first InMemoryFlushRunnable will use the changed versionedList to > remove the corresponding segments. 
> {code:title=MemStoreCompactor#doCompaction|borderStyle=solid} > if (!isInterrupted.get()) { > if (resultSwapped = compactingMemStore.swapCompactedSegments( > versionedList, result, (action==Action.MERGE))) { > // update the wal so it can be truncated and not get too long > compactingMemStore.updateLowestUnflushedSequenceIdInWAL(true); // > only if greater > } > } > {code} > In conclusion, the first InMemoryFlushRunnable will remove the wrong segments. And > the later InMemoryFlushRunnable may introduce an NPE because the first > InMemoryFlushRunnable sets versionedList to null after compaction. > {code} > Exception in thread > "RpcServer.default.FPBQ.Fifo.handler=3,queue=0,port=45712-inmemoryCompactions-1496563908038" > java.lang.NullPointerException > at > org.apache.hadoop.hbase.regionserver.CompactionPipeline.swap(CompactionPipeline.java:119) > at > org.apache.hadoop.hbase.regionserver.CompactingMemStore.swapCompactedSegments(CompactingMemStore.java:283) > at > org.apache.hadoop.hbase.regionserver.MemStoreCompactor.doCompaction(MemStoreCompactor.java:212) > at > org.apache.hadoop.hbase.regionserver.MemStoreCompactor.start(MemStoreCompactor.java:122) > at > org.apache.hadoop.hbase.regionserver.CompactingMemStore.flushInMemory(CompactingMemStore.java:388) > at > org.apache.hadoop.hbase.regionserver.CompactingMemStore$InMemoryFlushRunnable.run(CompactingMemStore.java:500) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
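One way to close the race described above is to make entry into the in-memory flush an atomic handoff. The sketch below is illustrative only (it is not the attached patch, and the class and method names are hypothetical): it uses AtomicBoolean.compareAndSet so that a second InMemoryFlushRunnable cannot start while the first still owns versionedList.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch: guard the in-memory flush with compareAndSet so two
// InMemoryFlushRunnables can never run the compaction body concurrently.
public class InMemoryFlushGuard {
  private final AtomicBoolean inMemoryFlushInProgress = new AtomicBoolean(false);

  /** @return true if this caller won the right to run the compaction. */
  public boolean tryStartFlush() {
    // Atomically flips false -> true; a concurrent second caller sees true
    // and backs off instead of racing on the shared versionedList.
    return inMemoryFlushInProgress.compareAndSet(false, true);
  }

  /** Only the thread that won tryStartFlush() clears the flag. */
  public void finishFlush() {
    inMemoryFlushInProgress.set(false);
  }

  public static void main(String[] args) {
    InMemoryFlushGuard g = new InMemoryFlushGuard();
    System.out.println(g.tryStartFlush()); // true: first thread starts
    System.out.println(g.tryStartFlush()); // false: second thread backs off
    g.finishFlush();
    System.out.println(g.tryStartFlush()); // true: next flush may start
  }
}
```

The key design point is that stopCompaction() must not blindly reset the flag on behalf of another thread, which is what lets the second runnable sneak in.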
[jira] [Commented] (HBASE-18158) Two running in-memory compaction threads may lose data
[ https://issues.apache.org/jira/browse/HBASE-18158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16036510#comment-16036510 ] Chia-Ping Tsai commented on HBASE-18158: [~anastas] Would you please take a look? Thanks -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-18145) The flush may cause the corrupt data for reading
[ https://issues.apache.org/jira/browse/HBASE-18145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16036509#comment-16036509 ] Anoop Sam John commented on HBASE-18145: +1 > The flush may cause the corrupt data for reading > > > Key: HBASE-18145 > URL: https://issues.apache.org/jira/browse/HBASE-18145 > Project: HBase > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Chia-Ping Tsai >Assignee: Chia-Ping Tsai >Priority: Blocker > Fix For: 2.0.0, 1.4.0, 1.3.2 > > Attachments: HBASE-18145.branch-1.v0.patch, HBASE-18145.v0.patch, > HBASE-18145.v1.patch > > > After HBASE-17887, the store scanner closes the memstore scanner when updating > the inner scanners. The chunk which stores the current data may be reclaimed. > So if the chunk is rewritten before we send the data to the client, the client > will receive corrupt data. > This issue also breaks the TestAcid* tests. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
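The "delayed close" idea behind the fix discussed in this thread can be modeled with a short sketch. This is not the attached patch; the interface and method names are hypothetical, and it only illustrates why a replaced memstore scanner must outlive the in-flight data it backs.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: instead of closing a replaced memstore scanner
// immediately (which may let its backing chunk be reclaimed and rewritten
// while a cell from it is still being shipped to the client), park it in a
// list and close it only after the response has been sent.
public class DelayedCloseSketch {
  interface KVScanner { void close(); }

  private final List<KVScanner> scannersForDelayedClose = new ArrayList<>();

  /** Called when inner scanners are swapped out during a flush. */
  void retire(KVScanner s) {
    // Do NOT close here: the chunk behind s may still back cells that
    // have not been copied out to the client yet.
    scannersForDelayedClose.add(s);
  }

  /** Called once the response (or the shipped batch) is done. */
  void closeRetired() {
    for (KVScanner s : scannersForDelayedClose) {
      s.close(); // safe now: no in-flight cell references the chunk
    }
    scannersForDelayedClose.clear();
  }

  public static void main(String[] args) {
    DelayedCloseSketch d = new DelayedCloseSketch();
    int[] closed = {0};
    d.retire(() -> closed[0]++);
    System.out.println(closed[0]); // 0: not closed while data is in flight
    d.closeRetired();
    System.out.println(closed[0]); // 1: closed after shipping
  }
}
```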
[jira] [Updated] (HBASE-18061) [C++] Fix retry logic in multi-get calls
[ https://issues.apache.org/jira/browse/HBASE-18061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sudeep Sunthankar updated HBASE-18061: -- Attachment: HBASE-18061.HBASE-14850.v1.patch This patch consists of: # Added exception-utils for ShouldRetry() method # Added checks for std::out_of_range and std::runtime_error in ShouldRetry() # Standardized std::exception to folly::exception_wrapper in AsyncBatchRpcRetryingCaller and helper classes. # Replaced ShouldRetry() in AsyncRpcRetryingCaller() with ExceptionUtils::ShouldRetry() # Changed Makefile to link krb5, ssl, crypto and pthread libraries Thanks. > [C++] Fix retry logic in multi-get calls > > > Key: HBASE-18061 > URL: https://issues.apache.org/jira/browse/HBASE-18061 > Project: HBase > Issue Type: Sub-task >Reporter: Enis Soztutar >Assignee: Sudeep Sunthankar > Fix For: HBASE-14850 > > Attachments: HBASE-18061.HBASE-14850.v1.patch > > > HBASE-17576 adds multi-gets. There are a couple of todos to fix in the retry > logic, and some unit testing to be done for the multi-gets. We'll do these in > this issue. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-18061) [C++] Fix retry logic in multi-get calls
[ https://issues.apache.org/jira/browse/HBASE-18061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sudeep Sunthankar updated HBASE-18061: -- Summary: [C++] Fix retry logic in multi-get calls (was: Fix retry logic in multi-get calls) -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-18159) Oracle JDK 7 is now only available for those with an Oracle Support account
[ https://issues.apache.org/jira/browse/HBASE-18159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16036485#comment-16036485 ] Duo Zhang commented on HBASE-18159: --- Two solutions: 1. Use OpenJDK instead. 2. Put JDK7 and JDK6 in a place where we can download them directly, and modify the docker image file to download the binaries to the right place before running the installer. But I do not know if this is legal. > Oracle JDK 7 is now only available for those with an Oracle Support account > --- > > Key: HBASE-18159 > URL: https://issues.apache.org/jira/browse/HBASE-18159 > Project: HBase > Issue Type: Bug >Reporter: Chia-Ping Tsai >Priority: Critical > > Context: https://builds.apache.org/job/PreCommit-HBASE-Build/7064/console > Ref: http://www.webupd8.org/2017/06/why-oracle-java-7-and-6-installers-no.html > I mark this as critical because it is hard to get a +1 from HadoopQA for > branch-1. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-18145) The flush may cause the corrupt data for reading
[ https://issues.apache.org/jira/browse/HBASE-18145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16036482#comment-16036482 ] Chia-Ping Tsai commented on HBASE-18145: [~anoop.hbase] Any suggestions for v1.patch? Thanks. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-18145) The flush may cause the corrupt data for reading
[ https://issues.apache.org/jira/browse/HBASE-18145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16036481#comment-16036481 ] Chia-Ping Tsai commented on HBASE-18145: The oracle-java7-installer has some problems. (ref: http://www.webupd8.org/2017/06/why-oracle-java-7-and-6-installers-no.html) Opened HBASE-18159 to discuss the problem. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HBASE-18159) Oracle JDK 7 is now only available for those with an Oracle Support account
Chia-Ping Tsai created HBASE-18159: -- Summary: Oracle JDK 7 is now only available for those with an Oracle Support account Key: HBASE-18159 URL: https://issues.apache.org/jira/browse/HBASE-18159 Project: HBase Issue Type: Bug Reporter: Chia-Ping Tsai Priority: Critical Context: https://builds.apache.org/job/PreCommit-HBASE-Build/7064/console Ref: http://www.webupd8.org/2017/06/why-oracle-java-7-and-6-installers-no.html I mark this as critical because it is hard to get a +1 from HadoopQA for branch-1. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-15576) Scanning cursor to prevent blocking long time on ResultScanner.next()
[ https://issues.apache.org/jira/browse/HBASE-15576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16036480#comment-16036480 ] Phil Yang commented on HBASE-15576: --- All UTs pass. Any comments? Thanks. > Scanning cursor to prevent blocking long time on ResultScanner.next() > - > > Key: HBASE-15576 > URL: https://issues.apache.org/jira/browse/HBASE-15576 > Project: HBase > Issue Type: New Feature >Reporter: Phil Yang >Assignee: Phil Yang > Fix For: 2.0.0, 1.4.0 > > Attachments: HBASE-15576.v01.patch, HBASE-15576.v02.patch, > HBASE-15576.v03.patch, HBASE-15576.v03.patch, HBASE-15576.v04.patch, > HBASE-15576.v04.patch, HBASE-15576.v05.patch, HBASE-15576.v06.patch > > > After 1.1.0 was released, we have the partial and heartbeat protocols in scanning to > prevent responding with large data or timing out. Now for ResultScanner.next(), we > may block for a time longer than the timeout settings to get a Result if the > row is very large, or the filter is sparse, or there are too many delete markers > in files. > However, in some scenarios, we don't want it to be blocked for too long. For > example, a web service which handles requests from mobile devices whose > network is not stable, where we cannot set the timeout too long (eg. only 5 seconds) > between mobile and web service. This service will scan rows from HBase and > return them to the mobile devices. In this scenario, the simplest way is to make the > web service stateless. Apps on mobile devices will send several requests one > by one to get the data until they have enough, just like paging a list. Each request > will carry a start position which depends on the last result from the web > service. Different requests can be sent to different web service servers > because it is stateless. > Therefore, the stateless web service needs a cursor from HBase telling where > we have scanned in RegionScanner when the HBase client receives an empty > heartbeat. And the service will return the cursor to the mobile device although > the response has no data. 
In the next request we can start at the position of > the cursor; without the cursor we would have to scan from the last returned result and we > may time out forever. And of course even if the heartbeat message is not empty > we can still use the cursor to prevent re-scanning the same rows/cells which have been > skipped. > Obviously, we will give up consistency for scanning because even the HBase client > is also stateless, but that is acceptable in this scenario. And maybe we can keep > mvcc in the cursor so we can get a consistent view? > HBASE-13099 had some discussion, but it has seen no further progress by now. > API: > In Scan we need a new method setNeedCursorResult(true) to get the cursor row > key when there is an RPC response but the client can not return any Result. In > this mode we will not block ResultScanner.next() longer than this timeout > setting. > {code} > while ((r = scanner.next()) != null) { > if (r.isCursor()) { > // scanning has not ended, it is a cursor; save its row key and close the scanner > if you want, or > // just continue the loop to call next(). > } else { > // just like before > } > } > // scanning has ended > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
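The client loop in the proposal above can be exercised end to end with a stub in place of HBase's real ResultScanner. This is a sketch under assumptions: the `Result`/`isCursor()` shapes follow the proposal in this issue, not a released API, and `scanSummary` is an invented helper.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the cursor-scan client pattern: heartbeat responses
// with no data surface as cursor results, so the caller can persist the last
// seen row and resume from it in a fresh stateless request.
public class CursorScanSketch {
  static final class Result {
    final String row; final boolean cursor;
    Result(String row, boolean cursor) { this.row = row; this.cursor = cursor; }
    boolean isCursor() { return cursor; }
  }

  /** Returns "<data-row count> <last seen row>" for a stream of responses. */
  static String scanSummary(List<Result> responses) {
    String lastRow = null;
    int dataRows = 0;
    for (Result r : responses) {
      lastRow = r.row;     // cursor or data, both advance the resume position
      if (!r.isCursor()) {
        dataRows++;        // only real rows are handed to the application
      }
    }
    return dataRows + " " + lastRow;
  }

  public static void main(String[] args) {
    List<Result> responses = Arrays.asList(
        new Result("row1", false),
        new Result("row5", true),   // cursor: scan reached row5, no data yet
        new Result("row9", false));
    System.out.println(scanSummary(responses)); // 2 row9
  }
}
```

Even when the cursor result carries no data, it moves the resume position forward, which is exactly what keeps the paging web service from re-scanning skipped rows.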
[jira] [Updated] (HBASE-18038) Rename StoreFile to HStoreFile and add a StoreFile interface for CP
[ https://issues.apache.org/jira/browse/HBASE-18038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Duo Zhang updated HBASE-18038: -- Attachment: HBASE-18038-v6.patch Rebase. Ping [~stack] [~apurtell] > Rename StoreFile to HStoreFile and add a StoreFile interface for CP > --- > > Key: HBASE-18038 > URL: https://issues.apache.org/jira/browse/HBASE-18038 > Project: HBase > Issue Type: Sub-task > Components: Coprocessors, regionserver >Affects Versions: 2.0.0 >Reporter: Duo Zhang >Assignee: Duo Zhang >Priority: Critical > Fix For: 2.0.0 > > Attachments: HBASE-18038.patch, HBASE-18038-v1.patch, > HBASE-18038-v1.patch, HBASE-18038-v2.patch, HBASE-18038-v3.patch, > HBASE-18038-v3.patch, HBASE-18038-v4.patch, HBASE-18038-v5.patch, > HBASE-18038-v6.patch > > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-18145) The flush may cause the corrupt data for reading
[ https://issues.apache.org/jira/browse/HBASE-18145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16036477#comment-16036477 ] Hadoop QA commented on HBASE-18145: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} docker {color} | {color:red} 1m 21s {color} | {color:red} Docker failed to build yetus/hbase:58c504e. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12871175/HBASE-18145.branch-1.v0.patch | | JIRA Issue | HBASE-18145 | | Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/7064/console | | Powered by | Apache Yetus 0.3.0 http://yetus.apache.org | This message was automatically generated. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-18145) The flush may cause the corrupt data for reading
[ https://issues.apache.org/jira/browse/HBASE-18145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chia-Ping Tsai updated HBASE-18145: --- Attachment: HBASE-18145.branch-1.v0.patch -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-18145) The flush may cause the corrupt data for reading
[ https://issues.apache.org/jira/browse/HBASE-18145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chia-Ping Tsai updated HBASE-18145: --- Status: Patch Available (was: Open) -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-18145) The flush may cause the corrupt data for reading
[ https://issues.apache.org/jira/browse/HBASE-18145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16036474#comment-16036474 ] Chia-Ping Tsai commented on HBASE-18145: v1.patch # corrects the typo # scannerForDelayedClose -> scannersForDelayedClose # closes the mem scanners after recreating the prev cell, to avoid creating a corrupt cell # close(boolean withHeapClose) -> close(withDelayedScannersClose) # uses only one scanner list to hold the scanners whose close is delayed -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-18145) The flush may cause the corrupt data for reading
[ https://issues.apache.org/jira/browse/HBASE-18145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chia-Ping Tsai updated HBASE-18145: --- Status: Open (was: Patch Available) -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-18156) Provide a tool to show block cache summary
[ https://issues.apache.org/jira/browse/HBASE-18156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allan Yang updated HBASE-18156: --- Attachment: HBASE-18156.v2.patch Modified according to review comments. Thanks, [~tedyu]. > Provide a tool to show block cache summary > -- > > Key: HBASE-18156 > URL: https://issues.apache.org/jira/browse/HBASE-18156 > Project: HBase > Issue Type: New Feature >Affects Versions: 2.0.0, 1.4.0 >Reporter: Allan Yang >Assignee: Allan Yang > Attachments: HBASE-18156.patch, HBASE-18156.v2.patch > > > HBASE-17757 is already committed. But since there is no easy way to show the > size distribution of cached blocks, it is hard to decide the unified size that > should be used. > Here I provide a tool to show the details of the size distribution of cached > blocks. This tool is well used in our production environment. It is a JSP page > that summarizes the cache details like this: > {code} > BlockCache type:org.apache.hadoop.hbase.io.hfile.LruBlockCache > LruBlockCache > Total size:28.40 GB > Current size:22.49 GB > MetaBlock size:1.56 GB > Free size:5.91 GB > Block count:152684 > Size distribution summary: > BlockCacheSizeDistributionSummary [0 B<=blocksize<4 KB, blocks=833, > heapSize=1.19 MB] > BlockCacheSizeDistributionSummary [4 KB<=blocksize<8 KB, blocks=65, > heapSize=310.83 KB] > BlockCacheSizeDistributionSummary [8 KB<=blocksize<12 KB, blocks=175, > heapSize=1.46 MB] > BlockCacheSizeDistributionSummary [12 KB<=blocksize<16 KB, blocks=18, > heapSize=267.43 KB] > BlockCacheSizeDistributionSummary [16 KB<=blocksize<20 KB, blocks=512, > heapSize=8.30 MB] > BlockCacheSizeDistributionSummary [20 KB<=blocksize<24 KB, blocks=22, > heapSize=499.66 KB] > BlockCacheSizeDistributionSummary [24 KB<=blocksize<28 KB, blocks=24, > heapSize=632.59 KB] > BlockCacheSizeDistributionSummary [28 KB<=blocksize<32 KB, blocks=34, > heapSize=1.02 MB] > BlockCacheSizeDistributionSummary [32 KB<=blocksize<36 KB, blocks=31, > heapSize=1.02 MB] > 
BlockCacheSizeDistributionSummary [36 KB<=blocksize<40 KB, blocks=22, > heapSize=838.58 KB] > BlockCacheSizeDistributionSummary [40 KB<=blocksize<44 KB, blocks=28, > heapSize=1.15 MB] > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
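The bucketing behind the summary above can be sketched in a few lines. This is a hypothetical model (the real tool walks the LruBlockCache's block map; the class and method names here are invented): group cached block sizes into 4 KB-wide bins and count blocks plus accumulated heap size per bin.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of a block-size histogram with 4 KB buckets, matching
// the bin widths in the sample output above.
public class BlockSizeHistogram {
  static final long BIN = 4 * 1024; // 4 KB bucket width

  /** Maps bin start offset -> {block count, accumulated bytes}. */
  static TreeMap<Long, long[]> summarize(long[] blockSizes) {
    TreeMap<Long, long[]> bins = new TreeMap<>();
    for (long size : blockSizes) {
      long binStart = (size / BIN) * BIN; // floor to the bucket boundary
      long[] acc = bins.computeIfAbsent(binStart, k -> new long[2]);
      acc[0]++;          // block count in this bin
      acc[1] += size;    // accumulated heap size in this bin
    }
    return bins;
  }

  public static void main(String[] args) {
    TreeMap<Long, long[]> bins = summarize(new long[] {1000, 3000, 5000, 70000});
    for (Map.Entry<Long, long[]> e : bins.entrySet()) {
      System.out.printf("%d KB<=blocksize<%d KB, blocks=%d, heapSize=%d B%n",
          e.getKey() / 1024, (e.getKey() + BIN) / 1024,
          e.getValue()[0], e.getValue()[1]);
    }
  }
}
```

A TreeMap keeps the bins sorted by size, so the printed summary comes out in the same ascending order as the JSP page's report.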
[jira] [Comment Edited] (HBASE-18128) compaction marker could be skipped
[ https://issues.apache.org/jira/browse/HBASE-18128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16036458#comment-16036458 ] Jingyun Tian edited comment on HBASE-18128 at 6/5/17 1:54 AM: -- [~anoop.hbase] Thanks for the comment. I think solving this in a generic way is a good idea; I will update the patch this week. Only the compaction marker needs this treatment right now, because when replaying a recovered edit the server checks whether it is a compaction marker, and if it is, the server deletes the old files before opening the region. Currently no other meta cells have special treatment. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-17359) Implement async admin
[ https://issues.apache.org/jira/browse/HBASE-17359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16036457#comment-16036457 ] Guanghao Zhang commented on HBASE-17359: https://builds.apache.org/job/HBASE-Find-Flaky-Tests/lastSuccessfulBuild/artifact/dashboard.html. Seems the TestAsync*Admin unit tests are not flaky anymore. So the only blocking thing is HBASE-18052, to add docs and examples for async admin. > Implement async admin > - > > Key: HBASE-17359 > URL: https://issues.apache.org/jira/browse/HBASE-17359 > Project: HBase > Issue Type: Umbrella > Components: Client >Reporter: Duo Zhang >Assignee: Guanghao Zhang > Labels: asynchronous > Fix For: 2.0.0 > > > And as we will return a CompletableFuture, I think we can just remove the > XXXAsync methods, and make all the methods blocking, which means we will only > complete the CompletableFuture when the operation is done. The user can choose > whether to wait on the returned CompletableFuture. > Converting this to an umbrella task. There may be some sub-tasks: > 1. Table admin operations. > 2. Region admin operations. > 3. Namespace admin operations. > 4. Snapshot admin operations. > 5. Replication admin operations. > 6. Other operations, like quota, balance.. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-18128) compaction marker could be skipped
[ https://issues.apache.org/jira/browse/HBASE-18128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16036458#comment-16036458 ] Jingyun Tian commented on HBASE-18128: -- [~anoop.hbase] Thanks for the comment. I think solving this in a generic way is a good idea; I will update the patch this week. Only the compaction marker needs this treatment right now, because when replaying a recovered edit the server checks whether it is a compaction marker, and if it is, the server deletes the old files before opening the region. Currently no other meta cells have special treatment. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-18097) Save bandwidth on partial_flag_per_result in ScanResponse proto
[ https://issues.apache.org/jira/browse/HBASE-18097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16036448#comment-16036448 ] Karan Mehta commented on HBASE-18097: - The problem can occur if the client wants results in a specified batch size, in which case the response can contain multiple partial results, which are then left to the user to handle appropriately based on the {{partial}} flag inside each result. This is usually the case with AsyncHBaseClient. > Save bandwidth on partial_flag_per_result in ScanResponse proto > --- > > Key: HBASE-18097 > URL: https://issues.apache.org/jira/browse/HBASE-18097 > Project: HBase > Issue Type: Improvement >Affects Versions: 2.0.0, 1.4.0 >Reporter: Karan Mehta >Assignee: Karan Mehta > > Currently the {{ScanResponse}} proto sends out 1 bit per {{Result}} that it > embeds inside the {{CellScanner}} to indicate whether it is partial or not. > {code} > // In every RPC response there should be at most a single partial result. > Furthermore, if > // there is a partial result, it is guaranteed to be in the last position > of the array. > {code} > According to the client, only the last result can be partial, so this repeated > bool can be converted to a single bool, reducing the overhead of serializing and > deserializing the array. This will break wire compatibility, therefore it is > something to look for in upcoming versions. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
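The per-result {{partial}} handling the comment describes can be sketched as below. The Result class here is a minimal stand-in for the real HBase class, and coalesce is a hypothetical helper, not an existing client API: the sketch only shows how a caller stitches a run of partial results back into complete rows.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of client-side handling of the partial flag.
class PartialResultSketch {
    static final class Result {
        final List<String> cells;
        final boolean partial;   // the per-Result flag discussed above
        Result(List<String> cells, boolean partial) {
            this.cells = cells;
            this.partial = partial;
        }
    }

    /** Accumulates cells from partial Results until a complete one arrives. */
    static List<List<String>> coalesce(List<Result> scanned) {
        List<List<String>> rows = new ArrayList<>();
        List<String> pending = new ArrayList<>();
        for (Result r : scanned) {
            pending.addAll(r.cells);
            if (!r.partial) {          // last fragment of the row
                rows.add(pending);
                pending = new ArrayList<>();
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        List<Result> scanned = Arrays.asList(
            new Result(Arrays.asList("cf:a"), true),
            new Result(Arrays.asList("cf:b"), false));
        List<List<String>> rows = coalesce(scanned);
        if (rows.size() != 1 || rows.get(0).size() != 2) throw new AssertionError();
    }
}
```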
[jira] [Commented] (HBASE-18158) Two running in-memory compaction threads may lose data
[ https://issues.apache.org/jira/browse/HBASE-18158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16036365#comment-16036365 ] Hadoop QA commented on HBASE-18158: --- | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 15s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s {color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 44s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 45s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 50s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 15s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 5s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 34s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 51s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 45s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | 
{color:green} 0m 45s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 50s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 16s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 31m 58s {color} | {color:green} Patch does not cause any errors with Hadoop 2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha2. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 17s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 34s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 134m 50s {color} | {color:green} hbase-server in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 24s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 181m 36s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=1.12.3 Server=1.12.3 Image:yetus/hbase:757bf37 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12871157/HBASE-18158.v0.patch | | JIRA Issue | HBASE-18158 | | Optional Tests | asflicense javac javadoc unit findbugs hadoopcheck hbaseanti checkstyle compile | | uname | Linux 447ee3eefae9 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build@2/component/dev-support/hbase-personality.sh | | git revision | master / e65d865 | | Default Java | 1.8.0_131 | | findbugs | v3.0.0 | | Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/7062/testReport/ | | modules | C: hbase-server U: hbase-server | | Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/7062/console | | Powered by | Apache Yetus 0.3.0 http://yetus.apache.org | This message was automatically generated. > Two running in-memory compaction threads may lose data > -- > > Key: HBASE-18158 > URL: https://issues.apache.org/jira/browse/HBASE-18158 > Project: HBase > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Chia-Ping Tsai >Assignee: Chia-Ping Tsai > Fix For: 2.0.0 > > Attachments: HBASE-18158.v0.patch > > >
[jira] [Commented] (HBASE-18145) The flush may cause the corrupt data for reading
[ https://issues.apache.org/jira/browse/HBASE-18145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16036363#comment-16036363 ] Hadoop QA commented on HBASE-18145: --- | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 18s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s {color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 10s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 44s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 51s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 16s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 8s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 32s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 51s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 44s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | 
{color:green} 0m 44s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 49s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 15s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 31m 57s {color} | {color:green} Patch does not cause any errors with Hadoop 2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha2. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 18s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 32s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 130m 43s {color} | {color:green} hbase-server in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 27s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 178m 1s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=1.12.3 Server=1.12.3 Image:yetus/hbase:757bf37 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12871158/HBASE-18145.v1.patch | | JIRA Issue | HBASE-18145 | | Optional Tests | asflicense javac javadoc unit findbugs hadoopcheck hbaseanti checkstyle compile | | uname | Linux 1e160ba33678 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh | | git revision | master / e65d865 | | Default Java | 1.8.0_131 | | findbugs | v3.0.0 | | Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/7061/testReport/ | | modules | C: hbase-server U: hbase-server | | Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/7061/console | | Powered by | Apache Yetus 0.3.0 http://yetus.apache.org | This message was automatically generated. > The flush may cause the corrupt data for reading > > > Key: HBASE-18145 > URL: https://issues.apache.org/jira/browse/HBASE-18145 > Project: HBase > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Chia-Ping Tsai >Assignee: Chia-Ping Tsai >Priority: Blocker > Fix For: 2.0.0, 1.4.0, 1.3.2 > > Attachments:
[jira] [Commented] (HBASE-18109) Assign system tables first (priority)
[ https://issues.apache.org/jira/browse/HBASE-18109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16036351#comment-16036351 ] Yi Liang commented on HBASE-18109: -- Yes, the patch cannot address Phil Yang's problem. I will consider adding a priority to the Procedure and finding a way for the Executor to pop high-priority procedures first. > Assign system tables first (priority) > - > > Key: HBASE-18109 > URL: https://issues.apache.org/jira/browse/HBASE-18109 > Project: HBase > Issue Type: Sub-task > Components: Region Assignment >Affects Versions: 2.0.0 >Reporter: stack >Assignee: Yi Liang >Priority: Critical > Fix For: 2.0.0 > > Attachments: HBASE-18109-V1.patch > > > Need this for stuff like the RSGroup table, etc. Assign these ahead of > user-space regions. > From 'Handle sys table assignment first (e.g. acl, namespace, rsgroup); > currently only hbase:meta is first.' of > https://docs.google.com/document/d/1eVKa7FHdeoJ1-9o8yZcOTAQbv0u0bblBlCCzVSIn69g/edit#heading=h.oefcyphs0v0x -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-18158) Two running in-memory compaction threads may lose data
[ https://issues.apache.org/jira/browse/HBASE-18158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chia-Ping Tsai updated HBASE-18158: --- Description: {code:title=CompactingMemStore.java|borderStyle=solid} private void stopCompaction() { if (inMemoryFlushInProgress.get()) { compactor.stop(); inMemoryFlushInProgress.set(false); } } {code} The stopCompaction() sets inMemoryFlushInProgress to false, so two in-memory compaction threads may execute simultaneously. If there are two running InMemoryFlushRunnable, the later InMemoryFlushRunnable may change the versionedList. {code:title=MemStoreCompactor.java|borderStyle=solid} public boolean start() throws IOException { if (!compactingMemStore.hasImmutableSegments()) { // no compaction on empty pipeline return false; } // get a snapshot of the list of the segments from the pipeline, // this local copy of the list is marked with specific version versionedList = compactingMemStore.getImmutableSegments(); {code} And the first InMemoryFlushRunnable will use the changed versionedList to remove the corresponding segments. {code:title=MemStoreCompactor#doCompaction|borderStyle=solid} if (!isInterrupted.get()) { if (resultSwapped = compactingMemStore.swapCompactedSegments( versionedList, result, (action==Action.MERGE))) { // update the wal so it can be truncated and not get too long compactingMemStore.updateLowestUnflushedSequenceIdInWAL(true); // only if greater } } {code} In conclusion, the first InMemoryFlushRunnable will remove the wrong segments. And the later InMemoryFlushRunnable may introduce an NPE because the first InMemoryFlushRunnable sets versionedList to null after compaction. 
{code} Exception in thread "RpcServer.default.FPBQ.Fifo.handler=3,queue=0,port=45712-inmemoryCompactions-1496563908038" java.lang.NullPointerException at org.apache.hadoop.hbase.regionserver.CompactionPipeline.swap(CompactionPipeline.java:119) at org.apache.hadoop.hbase.regionserver.CompactingMemStore.swapCompactedSegments(CompactingMemStore.java:283) at org.apache.hadoop.hbase.regionserver.MemStoreCompactor.doCompaction(MemStoreCompactor.java:212) at org.apache.hadoop.hbase.regionserver.MemStoreCompactor.start(MemStoreCompactor.java:122) at org.apache.hadoop.hbase.regionserver.CompactingMemStore.flushInMemory(CompactingMemStore.java:388) at org.apache.hadoop.hbase.regionserver.CompactingMemStore$InMemoryFlushRunnable.run(CompactingMemStore.java:500) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) {code} was: {code:title=CompactingMemStore.java|borderStyle=solid} private void stopCompaction() { if (inMemoryFlushInProgress.get()) { compactor.stop(); inMemoryFlushInProgress.set(false); } } {code} The stopCompaction() set inMemoryFlushInProgress to false so there may be two in-memory compaction threads which execute simultaneously. If there are two running InMemoryFlushRunnable, the later InMemoryFlushRunnable may change the versionedList. {code:title=MemStoreCompactor.java|borderStyle=solid} public boolean start() throws IOException { if (!compactingMemStore.hasImmutableSegments()) { // no compaction on empty pipeline return false; } // get a snapshot of the list of the segments from the pipeline, // this local copy of the list is marked with specific version versionedList = compactingMemStore.getImmutableSegments(); {code} And the first InMemoryFlushRunnable will use the chagned versionedList to remove the corresponding segments. 
{code:title=MemStoreCompactor#doCompaction|borderStyle=solid} if (!isInterrupted.get()) { if (resultSwapped = compactingMemStore.swapCompactedSegments( versionedList, result, (action==Action.MERGE))) { // update the wal so it can be truncated and not get too long compactingMemStore.updateLowestUnflushedSequenceIdInWAL(true); // only if greater } } {code} In conclusion, first InMemoryFlushRunnable will remove the worng segment. And the later InMemoryFlushRunnable will introduce NPE because first InMemoryFlushRunnable set versionedList to null after compaction. {code} Exception in thread "RpcServer.default.FPBQ.Fifo.handler=3,queue=0,port=45712-inmemoryCompactions-1496563908038" java.lang.NullPointerException at org.apache.hadoop.hbase.regionserver.CompactionPipeline.swap(CompactionPipeline.java:119) at org.apache.hadoop.hbase.regionserver.CompactingMemStore.swapCompactedSegments(CompactingMemStore.java:283) at org.apache.hadoop.hbase.regionserver.MemStoreCompactor.doCompaction(MemStoreCompactor.java:212) at org.apache.hadoop.hbase.regionserver.MemStoreCompactor.start(MemStoreCompactor.java:122) at
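The race described above comes from a separate get()-then-set() on the flag, which lets two threads both observe "no flush in progress". A minimal sketch of one way to close it, using compareAndSet so only a single thread wins entry to the in-memory flush; the class and method names here are illustrative, not the actual CompactingMemStore code:

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical guard sketch: compareAndSet makes the check-and-claim
// atomic, so two InMemoryFlushRunnable-style threads can never both
// proceed the way the buggy stopCompaction() path allowed.
class InMemoryFlushGuard {
    private final AtomicBoolean inMemoryFlushInProgress = new AtomicBoolean(false);

    /** Returns true only for the single thread that wins the CAS. */
    boolean tryStartFlush() {
        return inMemoryFlushInProgress.compareAndSet(false, true);
    }

    /** Called from a finally block once the flush/compaction finishes. */
    void finishFlush() {
        inMemoryFlushInProgress.set(false);
    }

    public static void main(String[] args) {
        InMemoryFlushGuard guard = new InMemoryFlushGuard();
        if (!guard.tryStartFlush()) throw new AssertionError();  // first thread wins
        if (guard.tryStartFlush()) throw new AssertionError();   // second thread must lose
        guard.finishFlush();
        if (!guard.tryStartFlush()) throw new AssertionError();  // available again
    }
}
```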
[jira] [Commented] (HBASE-18158) Two running in-memory compaction threads may lose data
[ https://issues.apache.org/jira/browse/HBASE-18158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16036325#comment-16036325 ] Chia-Ping Tsai commented on HBASE-18158: If the compaction thread checks the isInterrupted before swapping the segment in pipeline, the NPE won't happen. bq. Maybe the cause was different in above case. I will keep my eye on TestAcid* after commiting this and HBASE-18145. > Two running in-memory compaction threads may lose data > -- > > Key: HBASE-18158 > URL: https://issues.apache.org/jira/browse/HBASE-18158 > Project: HBase > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Chia-Ping Tsai >Assignee: Chia-Ping Tsai > Fix For: 2.0.0 > > Attachments: HBASE-18158.v0.patch > > > {code:title=CompactingMemStore.java|borderStyle=solid} > private void stopCompaction() { > if (inMemoryFlushInProgress.get()) { > compactor.stop(); > inMemoryFlushInProgress.set(false); > } > } > {code} > The stopCompaction() set inMemoryFlushInProgress to false so there may be two > in-memory compaction threads which execute simultaneously. If there are two > running InMemoryFlushRunnable, the later InMemoryFlushRunnable may change the > versionedList. > {code:title=MemStoreCompactor.java|borderStyle=solid} > public boolean start() throws IOException { > if (!compactingMemStore.hasImmutableSegments()) { // no compaction on > empty pipeline > return false; > } > // get a snapshot of the list of the segments from the pipeline, > // this local copy of the list is marked with specific version > versionedList = compactingMemStore.getImmutableSegments(); > {code} > And the first InMemoryFlushRunnable will use the chagned versionedList to > remove the corresponding segments. 
> {code:title=MemStoreCompactor#doCompaction|borderStyle=solid} > if (!isInterrupted.get()) { > if (resultSwapped = compactingMemStore.swapCompactedSegments( > versionedList, result, (action==Action.MERGE))) { > // update the wal so it can be truncated and not get too long > compactingMemStore.updateLowestUnflushedSequenceIdInWAL(true); // > only if greater > } > } > {code} > In conclusion, first InMemoryFlushRunnable will remove the worng segment. And > the later InMemoryFlushRunnable will introduce NPE because first > InMemoryFlushRunnable set versionedList to null after compaction. > {code} > Exception in thread > "RpcServer.default.FPBQ.Fifo.handler=3,queue=0,port=45712-inmemoryCompactions-1496563908038" > java.lang.NullPointerException > at > org.apache.hadoop.hbase.regionserver.CompactionPipeline.swap(CompactionPipeline.java:119) > at > org.apache.hadoop.hbase.regionserver.CompactingMemStore.swapCompactedSegments(CompactingMemStore.java:283) > at > org.apache.hadoop.hbase.regionserver.MemStoreCompactor.doCompaction(MemStoreCompactor.java:212) > at > org.apache.hadoop.hbase.regionserver.MemStoreCompactor.start(MemStoreCompactor.java:122) > at > org.apache.hadoop.hbase.regionserver.CompactingMemStore.flushInMemory(CompactingMemStore.java:388) > at > org.apache.hadoop.hbase.regionserver.CompactingMemStore$InMemoryFlushRunnable.run(CompactingMemStore.java:500) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-18158) Two running in-memory compaction threads may lose data
[ https://issues.apache.org/jira/browse/HBASE-18158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16036320#comment-16036320 ] Ted Yu commented on HBASE-18158: Looking at : https://builds.apache.org/job/HBASE-Flaky-Tests/16769/testReport/junit/org.apache.hadoop.hbase/TestAcidGuarantees/testMobMixedAtomicity_1_/ there was no NPE. Maybe the cause was different in above case. > Two running in-memory compaction threads may lose data > -- > > Key: HBASE-18158 > URL: https://issues.apache.org/jira/browse/HBASE-18158 > Project: HBase > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Chia-Ping Tsai >Assignee: Chia-Ping Tsai > Fix For: 2.0.0 > > Attachments: HBASE-18158.v0.patch > > > {code:title=CompactingMemStore.java|borderStyle=solid} > private void stopCompaction() { > if (inMemoryFlushInProgress.get()) { > compactor.stop(); > inMemoryFlushInProgress.set(false); > } > } > {code} > The stopCompaction() set inMemoryFlushInProgress to false so there may be two > in-memory compaction threads which execute simultaneously. If there are two > running InMemoryFlushRunnable, the later InMemoryFlushRunnable may change the > versionedList. > {code:title=MemStoreCompactor.java|borderStyle=solid} > public boolean start() throws IOException { > if (!compactingMemStore.hasImmutableSegments()) { // no compaction on > empty pipeline > return false; > } > // get a snapshot of the list of the segments from the pipeline, > // this local copy of the list is marked with specific version > versionedList = compactingMemStore.getImmutableSegments(); > {code} > And the first InMemoryFlushRunnable will use the chagned versionedList to > remove the corresponding segments. 
> {code:title=MemStoreCompactor#doCompaction|borderStyle=solid} > if (!isInterrupted.get()) { > if (resultSwapped = compactingMemStore.swapCompactedSegments( > versionedList, result, (action==Action.MERGE))) { > // update the wal so it can be truncated and not get too long > compactingMemStore.updateLowestUnflushedSequenceIdInWAL(true); // > only if greater > } > } > {code} > In conclusion, first InMemoryFlushRunnable will remove the worng segment. And > the later InMemoryFlushRunnable will introduce NPE because first > InMemoryFlushRunnable set versionedList to null after compaction. > {code} > Exception in thread > "RpcServer.default.FPBQ.Fifo.handler=3,queue=0,port=45712-inmemoryCompactions-1496563908038" > java.lang.NullPointerException > at > org.apache.hadoop.hbase.regionserver.CompactionPipeline.swap(CompactionPipeline.java:119) > at > org.apache.hadoop.hbase.regionserver.CompactingMemStore.swapCompactedSegments(CompactingMemStore.java:283) > at > org.apache.hadoop.hbase.regionserver.MemStoreCompactor.doCompaction(MemStoreCompactor.java:212) > at > org.apache.hadoop.hbase.regionserver.MemStoreCompactor.start(MemStoreCompactor.java:122) > at > org.apache.hadoop.hbase.regionserver.CompactingMemStore.flushInMemory(CompactingMemStore.java:388) > at > org.apache.hadoop.hbase.regionserver.CompactingMemStore$InMemoryFlushRunnable.run(CompactingMemStore.java:500) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-18158) Two running in-memory compaction threads may lose data
[ https://issues.apache.org/jira/browse/HBASE-18158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16036310#comment-16036310 ] Chia-Ping Tsai commented on HBASE-18158: bq. MyCompactingMemStoreWithCustomeCompactor -> MyCompactingMemStoreWithCustomCompactor copy that. Thanks for the reviews. > Two running in-memory compaction threads may lose data > -- > > Key: HBASE-18158 > URL: https://issues.apache.org/jira/browse/HBASE-18158 > Project: HBase > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Chia-Ping Tsai >Assignee: Chia-Ping Tsai > Fix For: 2.0.0 > > Attachments: HBASE-18158.v0.patch > > > {code:title=CompactingMemStore.java|borderStyle=solid} > private void stopCompaction() { > if (inMemoryFlushInProgress.get()) { > compactor.stop(); > inMemoryFlushInProgress.set(false); > } > } > {code} > The stopCompaction() set inMemoryFlushInProgress to false so there may be two > in-memory compaction threads which execute simultaneously. If there are two > running InMemoryFlushRunnable, the later InMemoryFlushRunnable may change the > versionedList. > {code:title=MemStoreCompactor.java|borderStyle=solid} > public boolean start() throws IOException { > if (!compactingMemStore.hasImmutableSegments()) { // no compaction on > empty pipeline > return false; > } > // get a snapshot of the list of the segments from the pipeline, > // this local copy of the list is marked with specific version > versionedList = compactingMemStore.getImmutableSegments(); > {code} > And the first InMemoryFlushRunnable will use the chagned versionedList to > remove the corresponding segments. 
> {code:title=MemStoreCompactor#doCompaction|borderStyle=solid} > if (!isInterrupted.get()) { > if (resultSwapped = compactingMemStore.swapCompactedSegments( > versionedList, result, (action==Action.MERGE))) { > // update the wal so it can be truncated and not get too long > compactingMemStore.updateLowestUnflushedSequenceIdInWAL(true); // > only if greater > } > } > {code} > In conclusion, first InMemoryFlushRunnable will remove the worng segment. And > the later InMemoryFlushRunnable will introduce NPE because first > InMemoryFlushRunnable set versionedList to null after compaction. > {code} > Exception in thread > "RpcServer.default.FPBQ.Fifo.handler=3,queue=0,port=45712-inmemoryCompactions-1496563908038" > java.lang.NullPointerException > at > org.apache.hadoop.hbase.regionserver.CompactionPipeline.swap(CompactionPipeline.java:119) > at > org.apache.hadoop.hbase.regionserver.CompactingMemStore.swapCompactedSegments(CompactingMemStore.java:283) > at > org.apache.hadoop.hbase.regionserver.MemStoreCompactor.doCompaction(MemStoreCompactor.java:212) > at > org.apache.hadoop.hbase.regionserver.MemStoreCompactor.start(MemStoreCompactor.java:122) > at > org.apache.hadoop.hbase.regionserver.CompactingMemStore.flushInMemory(CompactingMemStore.java:388) > at > org.apache.hadoop.hbase.regionserver.CompactingMemStore$InMemoryFlushRunnable.run(CompactingMemStore.java:500) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-18158) Two running in-memory compaction threads may lose data
[ https://issues.apache.org/jira/browse/HBASE-18158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16036309#comment-16036309 ] Ted Yu commented on HBASE-18158: Nice finding. MyCompactingMemStoreWithCustomeCompactor -> MyCompactingMemStoreWithCustomCompactor > Two running in-memory compaction threads may lose data > -- > > Key: HBASE-18158 > URL: https://issues.apache.org/jira/browse/HBASE-18158 > Project: HBase > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Chia-Ping Tsai >Assignee: Chia-Ping Tsai > Fix For: 2.0.0 > > Attachments: HBASE-18158.v0.patch > > > {code:title=CompactingMemStore.java|borderStyle=solid} > private void stopCompaction() { > if (inMemoryFlushInProgress.get()) { > compactor.stop(); > inMemoryFlushInProgress.set(false); > } > } > {code} > The stopCompaction() set inMemoryFlushInProgress to false so there may be two > in-memory compaction threads which execute simultaneously. If there are two > running InMemoryFlushRunnable, the later InMemoryFlushRunnable may change the > versionedList. > {code:title=MemStoreCompactor.java|borderStyle=solid} > public boolean start() throws IOException { > if (!compactingMemStore.hasImmutableSegments()) { // no compaction on > empty pipeline > return false; > } > // get a snapshot of the list of the segments from the pipeline, > // this local copy of the list is marked with specific version > versionedList = compactingMemStore.getImmutableSegments(); > {code} > And the first InMemoryFlushRunnable will use the chagned versionedList to > remove the corresponding segments. 
> {code:title=MemStoreCompactor#doCompaction|borderStyle=solid} > if (!isInterrupted.get()) { > if (resultSwapped = compactingMemStore.swapCompactedSegments( > versionedList, result, (action==Action.MERGE))) { > // update the wal so it can be truncated and not get too long > compactingMemStore.updateLowestUnflushedSequenceIdInWAL(true); // > only if greater > } > } > {code} > In conclusion, first InMemoryFlushRunnable will remove the worng segment. And > the later InMemoryFlushRunnable will introduce NPE because first > InMemoryFlushRunnable set versionedList to null after compaction. > {code} > Exception in thread > "RpcServer.default.FPBQ.Fifo.handler=3,queue=0,port=45712-inmemoryCompactions-1496563908038" > java.lang.NullPointerException > at > org.apache.hadoop.hbase.regionserver.CompactionPipeline.swap(CompactionPipeline.java:119) > at > org.apache.hadoop.hbase.regionserver.CompactingMemStore.swapCompactedSegments(CompactingMemStore.java:283) > at > org.apache.hadoop.hbase.regionserver.MemStoreCompactor.doCompaction(MemStoreCompactor.java:212) > at > org.apache.hadoop.hbase.regionserver.MemStoreCompactor.start(MemStoreCompactor.java:122) > at > org.apache.hadoop.hbase.regionserver.CompactingMemStore.flushInMemory(CompactingMemStore.java:388) > at > org.apache.hadoop.hbase.regionserver.CompactingMemStore$InMemoryFlushRunnable.run(CompactingMemStore.java:500) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-18158) Two running in-memory compaction threads may lose data
[ https://issues.apache.org/jira/browse/HBASE-18158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chia-Ping Tsai updated HBASE-18158: --- Description: {code:title=CompactingMemStore.java|borderStyle=solid} private void stopCompaction() { if (inMemoryFlushInProgress.get()) { compactor.stop(); inMemoryFlushInProgress.set(false); } } {code} The stopCompaction() sets inMemoryFlushInProgress to false, so two in-memory compaction threads may execute simultaneously. If there are two running InMemoryFlushRunnable, the later InMemoryFlushRunnable may change the versionedList. {code:title=MemStoreCompactor.java|borderStyle=solid} public boolean start() throws IOException { if (!compactingMemStore.hasImmutableSegments()) { // no compaction on empty pipeline return false; } // get a snapshot of the list of the segments from the pipeline, // this local copy of the list is marked with specific version versionedList = compactingMemStore.getImmutableSegments(); {code} And the first InMemoryFlushRunnable will use the changed versionedList to remove the corresponding segments. {code:title=MemStoreCompactor#doCompaction|borderStyle=solid} if (!isInterrupted.get()) { if (resultSwapped = compactingMemStore.swapCompactedSegments( versionedList, result, (action==Action.MERGE))) { // update the wal so it can be truncated and not get too long compactingMemStore.updateLowestUnflushedSequenceIdInWAL(true); // only if greater } } {code} In conclusion, the first InMemoryFlushRunnable will remove the wrong segments. And the later InMemoryFlushRunnable will introduce an NPE because the first InMemoryFlushRunnable sets versionedList to null after compaction. 
{code} Exception in thread "RpcServer.default.FPBQ.Fifo.handler=3,queue=0,port=45712-inmemoryCompactions-1496563908038" java.lang.NullPointerException at org.apache.hadoop.hbase.regionserver.CompactionPipeline.swap(CompactionPipeline.java:119) at org.apache.hadoop.hbase.regionserver.CompactingMemStore.swapCompactedSegments(CompactingMemStore.java:283) at org.apache.hadoop.hbase.regionserver.MemStoreCompactor.doCompaction(MemStoreCompactor.java:212) at org.apache.hadoop.hbase.regionserver.MemStoreCompactor.start(MemStoreCompactor.java:122) at org.apache.hadoop.hbase.regionserver.CompactingMemStore.flushInMemory(CompactingMemStore.java:388) at org.apache.hadoop.hbase.regionserver.CompactingMemStore$InMemoryFlushRunnable.run(CompactingMemStore.java:500) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) {code} was: {code:title=CompactingMemStore.java|borderStyle=solid} private void stopCompaction() { if (inMemoryFlushInProgress.get()) { compactor.stop(); inMemoryFlushInProgress.set(false); } } {code} The stopCompaction() set inMemoryFlushInProgress to false so there may be two in-memory compaction thread are executed simultaneously. If there are two running InMemoryFlushRunnable, the later InMemoryFlushRunnable may change the versionedList. {code:title=MemStoreCompactor.java|borderStyle=solid} public boolean start() throws IOException { if (!compactingMemStore.hasImmutableSegments()) { // no compaction on empty pipeline return false; } // get a snapshot of the list of the segments from the pipeline, // this local copy of the list is marked with specific version versionedList = compactingMemStore.getImmutableSegments(); {code} And the first InMemoryFlushRunnable will use the chagned versionedList to remove the corresponding segments. 
{code:title=MemStoreCompactor#doCompaction|borderStyle=solid} if (!isInterrupted.get()) { if (resultSwapped = compactingMemStore.swapCompactedSegments( versionedList, result, (action==Action.MERGE))) { // update the wal so it can be truncated and not get too long compactingMemStore.updateLowestUnflushedSequenceIdInWAL(true); // only if greater } } {code} In conclusion, first InMemoryFlushRunnable will remove the worng segment. And the later InMemoryFlushRunnable will introduce NPE because first InMemoryFlushRunnable set versionedList to null after compaction. {code} Exception in thread "RpcServer.default.FPBQ.Fifo.handler=3,queue=0,port=45712-inmemoryCompactions-1496563908038" java.lang.NullPointerException at org.apache.hadoop.hbase.regionserver.CompactionPipeline.swap(CompactionPipeline.java:119) at org.apache.hadoop.hbase.regionserver.CompactingMemStore.swapCompactedSegments(CompactingMemStore.java:283) at org.apache.hadoop.hbase.regionserver.MemStoreCompactor.doCompaction(MemStoreCompactor.java:212) at org.apache.hadoop.hbase.regionserver.MemStoreCompactor.start(MemStoreCompactor.java:122) at
[jira] [Updated] (HBASE-18145) The flush may cause the corrupt data for reading
[ https://issues.apache.org/jira/browse/HBASE-18145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chia-Ping Tsai updated HBASE-18145:
---
Status: Patch Available  (was: Open)

> The flush may cause the corrupt data for reading
> ------------------------------------------------
>
>                 Key: HBASE-18145
>                 URL: https://issues.apache.org/jira/browse/HBASE-18145
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 2.0.0
>            Reporter: Chia-Ping Tsai
>            Assignee: Chia-Ping Tsai
>            Priority: Blocker
>             Fix For: 2.0.0, 1.4.0, 1.3.2
>
>         Attachments: HBASE-18145.v0.patch, HBASE-18145.v1.patch
>
>
> After HBASE-17887, the store scanner closes the memstore scanner when updating the inner scanners. The chunk which stores the current data may be reclaimed, so if the chunk is rewritten before we send the data to the client, the client will receive corrupt data.
> This issue also breaks the TestAcid* tests.
--
This message was sent by Atlassian JIRA (v6.3.15#6346)
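The failure mode described above — a memstore chunk reclaimed and rewritten while its bytes are still referenced by an in-flight read — is commonly defended against by copying the cell bytes out of the shared chunk before the scanner that pins it is closed. A hypothetical sketch of that idea only (not the HBASE-18145 patch; the class and method names are made up):

```java
import java.util.Arrays;

// Hypothetical sketch: copy a value out of a shared, reusable chunk buffer so
// that later reclamation/rewriting of the chunk cannot corrupt the bytes
// already handed to the client.
public class ChunkCopy {
    // Copies 'len' bytes starting at 'off' out of the shared chunk.
    public static byte[] copyOut(byte[] chunk, int off, int len) {
        return Arrays.copyOfRange(chunk, off, off + len);
    }

    public static void main(String[] args) {
        byte[] chunk = new byte[] {1, 2, 3, 4};
        byte[] safe = copyOut(chunk, 1, 2);        // take a private copy first
        Arrays.fill(chunk, (byte) 0);              // simulate the chunk being reclaimed and rewritten
        System.out.println(Arrays.toString(safe)); // [2, 3] -- the copy is unaffected
    }
}
```

The copy costs an allocation per returned cell, which is the usual trade-off against returning a view into a pooled buffer.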
[jira] [Updated] (HBASE-18145) The flush may cause the corrupt data for reading
[ https://issues.apache.org/jira/browse/HBASE-18145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chia-Ping Tsai updated HBASE-18145:
---
Attachment: HBASE-18145.v1.patch

> The flush may cause the corrupt data for reading
> ------------------------------------------------
>
>                 Key: HBASE-18145
>                 URL: https://issues.apache.org/jira/browse/HBASE-18145
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 2.0.0
>            Reporter: Chia-Ping Tsai
>            Assignee: Chia-Ping Tsai
>            Priority: Blocker
>             Fix For: 2.0.0, 1.4.0, 1.3.2
>
>         Attachments: HBASE-18145.v0.patch, HBASE-18145.v1.patch
>
>
> After HBASE-17887, the store scanner closes the memstore scanner when updating the inner scanners. The chunk which stores the current data may be reclaimed, so if the chunk is rewritten before we send the data to the client, the client will receive corrupt data.
> This issue also breaks the TestAcid* tests.
--
This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-18158) Two running in-memory compaction threads may lose data
[ https://issues.apache.org/jira/browse/HBASE-18158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chia-Ping Tsai updated HBASE-18158:
---
Attachment: HBASE-18158.v0.patch

> Two running in-memory compaction threads may lose data
> ------------------------------------------------------
>
>                 Key: HBASE-18158
>                 URL: https://issues.apache.org/jira/browse/HBASE-18158
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 2.0.0
>            Reporter: Chia-Ping Tsai
>             Fix For: 2.0.0
>
>         Attachments: HBASE-18158.v0.patch
>
>
> {code:title=CompactingMemStore.java|borderStyle=solid}
> private void stopCompaction() {
>   if (inMemoryFlushInProgress.get()) {
>     compactor.stop();
>     inMemoryFlushInProgress.set(false);
>   }
> }
> {code}
> stopCompaction() sets inMemoryFlushInProgress to false, so two in-memory compaction threads may execute simultaneously. If there are two running InMemoryFlushRunnable instances, the later InMemoryFlushRunnable may change the versionedList.
> {code:title=MemStoreCompactor.java|borderStyle=solid}
> public boolean start() throws IOException {
>   if (!compactingMemStore.hasImmutableSegments()) { // no compaction on empty pipeline
>     return false;
>   }
>   // get a snapshot of the list of the segments from the pipeline,
>   // this local copy of the list is marked with specific version
>   versionedList = compactingMemStore.getImmutableSegments();
> {code}
> And the first InMemoryFlushRunnable will use the changed versionedList to remove the corresponding segments.
> {code:title=MemStoreCompactor#doCompaction|borderStyle=solid}
> if (!isInterrupted.get()) {
>   if (resultSwapped = compactingMemStore.swapCompactedSegments(
>       versionedList, result, (action==Action.MERGE))) {
>     // update the wal so it can be truncated and not get too long
>     compactingMemStore.updateLowestUnflushedSequenceIdInWAL(true); // only if greater
>   }
> }
> {code}
> In conclusion, the first InMemoryFlushRunnable will remove the wrong segment, and the later InMemoryFlushRunnable will throw an NPE because the first InMemoryFlushRunnable sets versionedList to null after compaction.
> {code}
> Exception in thread "RpcServer.default.FPBQ.Fifo.handler=3,queue=0,port=45712-inmemoryCompactions-1496563908038" java.lang.NullPointerException
> 	at org.apache.hadoop.hbase.regionserver.CompactionPipeline.swap(CompactionPipeline.java:119)
> 	at org.apache.hadoop.hbase.regionserver.CompactingMemStore.swapCompactedSegments(CompactingMemStore.java:283)
> 	at org.apache.hadoop.hbase.regionserver.MemStoreCompactor.doCompaction(MemStoreCompactor.java:212)
> 	at org.apache.hadoop.hbase.regionserver.MemStoreCompactor.start(MemStoreCompactor.java:122)
> 	at org.apache.hadoop.hbase.regionserver.CompactingMemStore.flushInMemory(CompactingMemStore.java:388)
> 	at org.apache.hadoop.hbase.regionserver.CompactingMemStore$InMemoryFlushRunnable.run(CompactingMemStore.java:500)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> {code}
--
This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-18158) Two running in-memory compaction threads may lose data
[ https://issues.apache.org/jira/browse/HBASE-18158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chia-Ping Tsai updated HBASE-18158:
---
Status: Patch Available  (was: Open)

> Two running in-memory compaction threads may lose data
> ------------------------------------------------------
>
>                 Key: HBASE-18158
>                 URL: https://issues.apache.org/jira/browse/HBASE-18158
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 2.0.0
>            Reporter: Chia-Ping Tsai
>            Assignee: Chia-Ping Tsai
>             Fix For: 2.0.0
>
>         Attachments: HBASE-18158.v0.patch
>
>
> {code:title=CompactingMemStore.java|borderStyle=solid}
> private void stopCompaction() {
>   if (inMemoryFlushInProgress.get()) {
>     compactor.stop();
>     inMemoryFlushInProgress.set(false);
>   }
> }
> {code}
> stopCompaction() sets inMemoryFlushInProgress to false, so two in-memory compaction threads may execute simultaneously. If there are two running InMemoryFlushRunnable instances, the later InMemoryFlushRunnable may change the versionedList.
> {code:title=MemStoreCompactor.java|borderStyle=solid}
> public boolean start() throws IOException {
>   if (!compactingMemStore.hasImmutableSegments()) { // no compaction on empty pipeline
>     return false;
>   }
>   // get a snapshot of the list of the segments from the pipeline,
>   // this local copy of the list is marked with specific version
>   versionedList = compactingMemStore.getImmutableSegments();
> {code}
> And the first InMemoryFlushRunnable will use the changed versionedList to remove the corresponding segments.
> {code:title=MemStoreCompactor#doCompaction|borderStyle=solid}
> if (!isInterrupted.get()) {
>   if (resultSwapped = compactingMemStore.swapCompactedSegments(
>       versionedList, result, (action==Action.MERGE))) {
>     // update the wal so it can be truncated and not get too long
>     compactingMemStore.updateLowestUnflushedSequenceIdInWAL(true); // only if greater
>   }
> }
> {code}
> In conclusion, the first InMemoryFlushRunnable will remove the wrong segment, and the later InMemoryFlushRunnable will throw an NPE because the first InMemoryFlushRunnable sets versionedList to null after compaction.
> {code}
> Exception in thread "RpcServer.default.FPBQ.Fifo.handler=3,queue=0,port=45712-inmemoryCompactions-1496563908038" java.lang.NullPointerException
> 	at org.apache.hadoop.hbase.regionserver.CompactionPipeline.swap(CompactionPipeline.java:119)
> 	at org.apache.hadoop.hbase.regionserver.CompactingMemStore.swapCompactedSegments(CompactingMemStore.java:283)
> 	at org.apache.hadoop.hbase.regionserver.MemStoreCompactor.doCompaction(MemStoreCompactor.java:212)
> 	at org.apache.hadoop.hbase.regionserver.MemStoreCompactor.start(MemStoreCompactor.java:122)
> 	at org.apache.hadoop.hbase.regionserver.CompactingMemStore.flushInMemory(CompactingMemStore.java:388)
> 	at org.apache.hadoop.hbase.regionserver.CompactingMemStore$InMemoryFlushRunnable.run(CompactingMemStore.java:500)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> {code}
--
This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Assigned] (HBASE-18158) Two running in-memory compaction threads may lose data
[ https://issues.apache.org/jira/browse/HBASE-18158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chia-Ping Tsai reassigned HBASE-18158:
--
Assignee: Chia-Ping Tsai

> Two running in-memory compaction threads may lose data
> ------------------------------------------------------
>
>                 Key: HBASE-18158
>                 URL: https://issues.apache.org/jira/browse/HBASE-18158
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 2.0.0
>            Reporter: Chia-Ping Tsai
>            Assignee: Chia-Ping Tsai
>             Fix For: 2.0.0
>
>         Attachments: HBASE-18158.v0.patch
>
>
> {code:title=CompactingMemStore.java|borderStyle=solid}
> private void stopCompaction() {
>   if (inMemoryFlushInProgress.get()) {
>     compactor.stop();
>     inMemoryFlushInProgress.set(false);
>   }
> }
> {code}
> stopCompaction() sets inMemoryFlushInProgress to false, so two in-memory compaction threads may execute simultaneously. If there are two running InMemoryFlushRunnable instances, the later InMemoryFlushRunnable may change the versionedList.
> {code:title=MemStoreCompactor.java|borderStyle=solid}
> public boolean start() throws IOException {
>   if (!compactingMemStore.hasImmutableSegments()) { // no compaction on empty pipeline
>     return false;
>   }
>   // get a snapshot of the list of the segments from the pipeline,
>   // this local copy of the list is marked with specific version
>   versionedList = compactingMemStore.getImmutableSegments();
> {code}
> And the first InMemoryFlushRunnable will use the changed versionedList to remove the corresponding segments.
> {code:title=MemStoreCompactor#doCompaction|borderStyle=solid}
> if (!isInterrupted.get()) {
>   if (resultSwapped = compactingMemStore.swapCompactedSegments(
>       versionedList, result, (action==Action.MERGE))) {
>     // update the wal so it can be truncated and not get too long
>     compactingMemStore.updateLowestUnflushedSequenceIdInWAL(true); // only if greater
>   }
> }
> {code}
> In conclusion, the first InMemoryFlushRunnable will remove the wrong segment, and the later InMemoryFlushRunnable will throw an NPE because the first InMemoryFlushRunnable sets versionedList to null after compaction.
> {code}
> Exception in thread "RpcServer.default.FPBQ.Fifo.handler=3,queue=0,port=45712-inmemoryCompactions-1496563908038" java.lang.NullPointerException
> 	at org.apache.hadoop.hbase.regionserver.CompactionPipeline.swap(CompactionPipeline.java:119)
> 	at org.apache.hadoop.hbase.regionserver.CompactingMemStore.swapCompactedSegments(CompactingMemStore.java:283)
> 	at org.apache.hadoop.hbase.regionserver.MemStoreCompactor.doCompaction(MemStoreCompactor.java:212)
> 	at org.apache.hadoop.hbase.regionserver.MemStoreCompactor.start(MemStoreCompactor.java:122)
> 	at org.apache.hadoop.hbase.regionserver.CompactingMemStore.flushInMemory(CompactingMemStore.java:388)
> 	at org.apache.hadoop.hbase.regionserver.CompactingMemStore$InMemoryFlushRunnable.run(CompactingMemStore.java:500)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> {code}
--
This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-18145) The flush may cause the corrupt data for reading
[ https://issues.apache.org/jira/browse/HBASE-18145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16036308#comment-16036308 ] Chia-Ping Tsai commented on HBASE-18145:
The failed TestAcid* may be caused by HBASE-18158. Let me submit the v1 patch.

> The flush may cause the corrupt data for reading
> ------------------------------------------------
>
>                 Key: HBASE-18145
>                 URL: https://issues.apache.org/jira/browse/HBASE-18145
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 2.0.0
>            Reporter: Chia-Ping Tsai
>            Assignee: Chia-Ping Tsai
>            Priority: Blocker
>             Fix For: 2.0.0, 1.4.0, 1.3.2
>
>         Attachments: HBASE-18145.v0.patch
>
>
> After HBASE-17887, the store scanner closes the memstore scanner when updating the inner scanners. The chunk which stores the current data may be reclaimed, so if the chunk is rewritten before we send the data to the client, the client will receive corrupt data.
> This issue also breaks the TestAcid* tests.
--
This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HBASE-18158) Two running in-memory compaction threads may lose data
Chia-Ping Tsai created HBASE-18158:
--
             Summary: Two running in-memory compaction threads may lose data
                 Key: HBASE-18158
                 URL: https://issues.apache.org/jira/browse/HBASE-18158
             Project: HBase
          Issue Type: Bug
    Affects Versions: 2.0.0
            Reporter: Chia-Ping Tsai
             Fix For: 2.0.0

{code:title=CompactingMemStore.java|borderStyle=solid}
private void stopCompaction() {
  if (inMemoryFlushInProgress.get()) {
    compactor.stop();
    inMemoryFlushInProgress.set(false);
  }
}
{code}
stopCompaction() sets inMemoryFlushInProgress to false, so two in-memory compaction threads may execute simultaneously. If there are two running InMemoryFlushRunnable instances, the later InMemoryFlushRunnable may change the versionedList.
{code:title=MemStoreCompactor.java|borderStyle=solid}
public boolean start() throws IOException {
  if (!compactingMemStore.hasImmutableSegments()) { // no compaction on empty pipeline
    return false;
  }
  // get a snapshot of the list of the segments from the pipeline,
  // this local copy of the list is marked with specific version
  versionedList = compactingMemStore.getImmutableSegments();
{code}
And the first InMemoryFlushRunnable will use the changed versionedList to remove the corresponding segments.
{code:title=MemStoreCompactor#doCompaction|borderStyle=solid}
if (!isInterrupted.get()) {
  if (resultSwapped = compactingMemStore.swapCompactedSegments(
      versionedList, result, (action==Action.MERGE))) {
    // update the wal so it can be truncated and not get too long
    compactingMemStore.updateLowestUnflushedSequenceIdInWAL(true); // only if greater
  }
}
{code}
In conclusion, the first InMemoryFlushRunnable will remove the wrong segment, and the later InMemoryFlushRunnable will throw an NPE because the first InMemoryFlushRunnable sets versionedList to null after compaction.
{code}
Exception in thread "RpcServer.default.FPBQ.Fifo.handler=3,queue=0,port=45712-inmemoryCompactions-1496563908038" java.lang.NullPointerException
	at org.apache.hadoop.hbase.regionserver.CompactionPipeline.swap(CompactionPipeline.java:119)
	at org.apache.hadoop.hbase.regionserver.CompactingMemStore.swapCompactedSegments(CompactingMemStore.java:283)
	at org.apache.hadoop.hbase.regionserver.MemStoreCompactor.doCompaction(MemStoreCompactor.java:212)
	at org.apache.hadoop.hbase.regionserver.MemStoreCompactor.start(MemStoreCompactor.java:122)
	at org.apache.hadoop.hbase.regionserver.CompactingMemStore.flushInMemory(CompactingMemStore.java:388)
	at org.apache.hadoop.hbase.regionserver.CompactingMemStore$InMemoryFlushRunnable.run(CompactingMemStore.java:500)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
{code}
--
This message was sent by Atlassian JIRA (v6.3.15#6346)
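The race described above is a classic check-then-act on the inMemoryFlushInProgress flag. As a minimal illustrative sketch (not the actual HBASE-18158 patch; the class and method names below are hypothetical), the window can be closed by acquiring the flag with compareAndSet, so only one in-memory flush can be in flight at a time:

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch: guard the in-memory flush with compareAndSet so that
// only the single thread that wins the CAS may proceed; a concurrent caller
// gets 'false' instead of racing past a stale get()-then-act check.
public class FlushGuard {
    private final AtomicBoolean inMemoryFlushInProgress = new AtomicBoolean(false);

    // Returns true only for the one caller that atomically flips false -> true.
    public boolean tryStartFlush() {
        return inMemoryFlushInProgress.compareAndSet(false, true);
    }

    // Called (e.g. from a finally block) by the winner once the flush completes.
    public void finishFlush() {
        inMemoryFlushInProgress.set(false);
    }

    public static void main(String[] args) {
        FlushGuard guard = new FlushGuard();
        System.out.println(guard.tryStartFlush()); // true: first caller wins
        System.out.println(guard.tryStartFlush()); // false: concurrent caller is rejected
        guard.finishFlush();
        System.out.println(guard.tryStartFlush()); // true: available again after reset
    }
}
```

With such a guard, a stopCompaction()-style method would also need to avoid blindly resetting a flag for a flush it does not own.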
[jira] [Commented] (HBASE-8770) deletes and puts with the same ts should be resolved according to mvcc/seqNum
[ https://issues.apache.org/jira/browse/HBASE-8770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16036302#comment-16036302 ] Phil Yang commented on HBASE-8770:
--
We are fixing HBASE-15968, which is more complicated because we have delete-version markers that delete all Puts whose ts is not larger than theirs, so the delete marker and the put may not have the same ts. But once HBASE-15968 is fixed, this issue will also be fixed.

> deletes and puts with the same ts should be resolved according to mvcc/seqNum
> -----------------------------------------------------------------------------
>
>                 Key: HBASE-8770
>                 URL: https://issues.apache.org/jira/browse/HBASE-8770
>             Project: HBase
>          Issue Type: Brainstorming
>            Reporter: Sergey Shelukhin
>            Priority: Critical
>
> This came up during HBASE-8721. Puts with the same ts are resolved by seqNum. It's not clear why deletes with the same ts as a put should always mask the put, rather than also being resolved by seqNum.
> What do you think?
--
This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HBASE-18157) TestLockProcedure is flaky in master branch
Ted Yu created HBASE-18157:
--
             Summary: TestLockProcedure is flaky in master branch
                 Key: HBASE-18157
                 URL: https://issues.apache.org/jira/browse/HBASE-18157
             Project: HBase
          Issue Type: Test
            Reporter: Ted Yu

From https://builds.apache.org/job/PreCommit-HBASE-Build/7060/artifact/patchprocess/patch-unit-hbase-server.txt :
{code}
testLocalMasterLockRecovery(org.apache.hadoop.hbase.master.locking.TestLockProcedure)  Time elapsed: 30.049 sec  <<< ERROR!
java.lang.Exception: Appears to be stuck in thread ResponseProcessor for block BP-387952765-172.17.0.2-1496577331729:blk_1073741842_1018
	at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
	at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
	at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
	at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
	at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
	at org.apache.hadoop.net.SocketIOWithTimeout$SelectorPool.select(SocketIOWithTimeout.java:335)
	at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:157)
	at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161)
	at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:131)
	at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:118)
{code}
TestLockProcedure is a small test. Its failure would prevent medium / large tests from running.
According to https://builds.apache.org/job/HBASE-Find-Flaky-Tests/lastSuccessfulBuild/artifact/dashboard.html , its flaky rate is 33%.
--
This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-18156) Provide a tool to show block cache summary
[ https://issues.apache.org/jira/browse/HBASE-18156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-18156:
---
Summary: Provide a tool to show block cache summary  (was: Provide a tool to show cache summary)

> Provide a tool to show block cache summary
> ------------------------------------------
>
>                 Key: HBASE-18156
>                 URL: https://issues.apache.org/jira/browse/HBASE-18156
>             Project: HBase
>          Issue Type: New Feature
>    Affects Versions: 2.0.0, 1.4.0
>            Reporter: Allan Yang
>            Assignee: Allan Yang
>         Attachments: HBASE-18156.patch
>
>
> HBASE-17757 is already committed, but since there is no easy way to show the size distribution of cached blocks, it is hard to decide which unified size should be used.
> Here I provide a tool to show the details of the size distribution of cached blocks. This tool is well used in our production environment. It is a JSP page that summarizes the cache details like this:
> {code}
> BlockCache type:org.apache.hadoop.hbase.io.hfile.LruBlockCache
> LruBlockCache
> Total size:28.40 GB
> Current size:22.49 GB
> MetaBlock size:1.56 GB
> Free size:5.91 GB
> Block count:152684
> Size distribution summary:
> BlockCacheSizeDistributionSummary [0 B<=blocksize<4 KB, blocks=833, heapSize=1.19 MB]
> BlockCacheSizeDistributionSummary [4 KB<=blocksize<8 KB, blocks=65, heapSize=310.83 KB]
> BlockCacheSizeDistributionSummary [8 KB<=blocksize<12 KB, blocks=175, heapSize=1.46 MB]
> BlockCacheSizeDistributionSummary [12 KB<=blocksize<16 KB, blocks=18, heapSize=267.43 KB]
> BlockCacheSizeDistributionSummary [16 KB<=blocksize<20 KB, blocks=512, heapSize=8.30 MB]
> BlockCacheSizeDistributionSummary [20 KB<=blocksize<24 KB, blocks=22, heapSize=499.66 KB]
> BlockCacheSizeDistributionSummary [24 KB<=blocksize<28 KB, blocks=24, heapSize=632.59 KB]
> BlockCacheSizeDistributionSummary [28 KB<=blocksize<32 KB, blocks=34, heapSize=1.02 MB]
> BlockCacheSizeDistributionSummary [32 KB<=blocksize<36 KB, blocks=31, heapSize=1.02 MB]
> BlockCacheSizeDistributionSummary [36 KB<=blocksize<40 KB, blocks=22, heapSize=838.58 KB]
> BlockCacheSizeDistributionSummary [40 KB<=blocksize<44 KB, blocks=28, heapSize=1.15 MB]
> {code}
--
This message was sent by Atlassian JIRA (v6.3.15#6346)
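For reference, the summary lines above are plain fixed-width bucketing of cached-block sizes. A rough sketch of that computation follows (the class and method names are made up for illustration, not taken from the attached patch):

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of the bucketing behind a line like
// "[4 KB<=blocksize<8 KB, blocks=65, ...]": group block sizes into fixed
// 4 KB buckets, keyed by each bucket's lower bound in bytes.
public class SizeDistribution {
    static final int BUCKET = 4 * 1024;

    public static Map<Integer, Integer> summarize(int[] blockSizes) {
        Map<Integer, Integer> counts = new TreeMap<>();
        for (int size : blockSizes) {
            int lower = (size / BUCKET) * BUCKET; // lower bound of the bucket holding 'size'
            counts.merge(lower, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // three blocks under 4 KB, one in [4 KB, 8 KB)
        System.out.println(summarize(new int[] {100, 2048, 4000, 5000})); // {0=3, 4096=1}
    }
}
```

A real summary would also accumulate heap size per bucket, but the bucket key computation is the essential part.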
[jira] [Commented] (HBASE-18156) Provide a tool to show cache summary
[ https://issues.apache.org/jira/browse/HBASE-18156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16036292#comment-16036292 ] Ted Yu commented on HBASE-18156:
Remove the year in the header of new classes.
Add an audience annotation for BlockCacheColumnFamilySummary.
{code}
93    * @return blocks in the cache
{code}
blocks -> number of blocks
For the equals() method, if you don't want to use curly braces, move the return to the end of the if () line.
{code}
198   * hdfs://localhost:51169/user/userid/-ROOT-/70236052/info/3944417774205889744
{code}
There is no root table any more - use a different table.
Should BlockCacheColumnFamilySummary be named ColumnFamilySummary ?
{code}
217   BlockCacheColumnFamilySummary bcse = null;
{code}
Make the variable name better reflect the class name.
{code}
240   * @return new BlockCacheSummaryEntry
{code}
Looks like BlockCacheSummaryEntry was the former name of the class. There are 6 occurrences of BlockCacheSummaryEntry - please correct them.
Add an audience annotation for BlockCacheSizeDistributionSummary.

> Provide a tool to show cache summary
> ------------------------------------
>
>                 Key: HBASE-18156
>                 URL: https://issues.apache.org/jira/browse/HBASE-18156
>             Project: HBase
>          Issue Type: New Feature
>    Affects Versions: 2.0.0, 1.4.0
>            Reporter: Allan Yang
>            Assignee: Allan Yang
>         Attachments: HBASE-18156.patch
>
>
> HBASE-17757 is already committed, but since there is no easy way to show the size distribution of cached blocks, it is hard to decide which unified size should be used.
> Here I provide a tool to show the details of the size distribution of cached blocks. This tool is well used in our production environment. It is a JSP page that summarizes the cache details like this:
> {code}
> BlockCache type:org.apache.hadoop.hbase.io.hfile.LruBlockCache
> LruBlockCache
> Total size:28.40 GB
> Current size:22.49 GB
> MetaBlock size:1.56 GB
> Free size:5.91 GB
> Block count:152684
> Size distribution summary:
> BlockCacheSizeDistributionSummary [0 B<=blocksize<4 KB, blocks=833, heapSize=1.19 MB]
> BlockCacheSizeDistributionSummary [4 KB<=blocksize<8 KB, blocks=65, heapSize=310.83 KB]
> BlockCacheSizeDistributionSummary [8 KB<=blocksize<12 KB, blocks=175, heapSize=1.46 MB]
> BlockCacheSizeDistributionSummary [12 KB<=blocksize<16 KB, blocks=18, heapSize=267.43 KB]
> BlockCacheSizeDistributionSummary [16 KB<=blocksize<20 KB, blocks=512, heapSize=8.30 MB]
> BlockCacheSizeDistributionSummary [20 KB<=blocksize<24 KB, blocks=22, heapSize=499.66 KB]
> BlockCacheSizeDistributionSummary [24 KB<=blocksize<28 KB, blocks=24, heapSize=632.59 KB]
> BlockCacheSizeDistributionSummary [28 KB<=blocksize<32 KB, blocks=34, heapSize=1.02 MB]
> BlockCacheSizeDistributionSummary [32 KB<=blocksize<36 KB, blocks=31, heapSize=1.02 MB]
> BlockCacheSizeDistributionSummary [36 KB<=blocksize<40 KB, blocks=22, heapSize=838.58 KB]
> BlockCacheSizeDistributionSummary [40 KB<=blocksize<44 KB, blocks=28, heapSize=1.15 MB]
> {code}
--
This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-18156) Provide a tool to show cache summary
[ https://issues.apache.org/jira/browse/HBASE-18156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16036259#comment-16036259 ] Hadoop QA commented on HBASE-18156: --- -1 overall
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 28s | Docker mode activated. |
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| 0 | mvndep | 0m 40s | Maven dependency ordering for branch |
| +1 | mvninstall | 6m 17s | master passed |
| +1 | compile | 1m 38s | master passed |
| +1 | checkstyle | 2m 54s | master passed |
| +1 | mvneclipse | 0m 45s | master passed |
| +1 | findbugs | 4m 36s | master passed |
| +1 | javadoc | 1m 6s | master passed |
| 0 | mvndep | 0m 21s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 47s | the patch passed |
| +1 | compile | 1m 42s | the patch passed |
| +1 | javac | 1m 42s | the patch passed |
| +1 | checkstyle | 2m 56s | the patch passed |
| +1 | mvneclipse | 0m 45s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | hadoopcheck | 56m 30s | Patch does not cause any errors with Hadoop 2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha2. |
| +1 | findbugs | 5m 12s | the patch passed |
| +1 | javadoc | 1m 7s | the patch passed |
| -1 | unit | 32m 48s | hbase-server in the patch failed. |
| +1 | unit | 0m 20s | hbase-external-blockcache in the patch passed. |
| +1 | asflicense | 0m 26s | The patch does not generate ASF License warnings. |
| | | 123m 16s | |
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.master.locking.TestLockProcedure |
|| Subsystem || Report/Notes ||
| Docker | Client=17.03.0-ce Server=17.03.0-ce Image:yetus/hbase:757bf37 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12871145/HBASE-18156.patch |
| JIRA Issue | HBASE-18156 |
| Optional Tests | asflicense javac javadoc unit findbugs hadoopcheck hbaseanti checkstyle compile |
| uname | Linux 83be0e3c975f 4.8.3-std-1 #1 SMP Fri Oct 21 11:15:43 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | master / e65d865 |
| Default Java | 1.8.0_131 |
| findbugs | v3.0.0 |
| unit | https://builds.apache.org/job/PreCommit-HBASE-Build/7060/artifact/patchprocess/patch-unit-hbase-server.txt |
| unit test logs | https://builds.apache.org/job/PreCommit-HBASE-Build/7060/artifact/patchprocess/patch-unit-hbase-server.txt |
| Test Results |
[jira] [Updated] (HBASE-18156) Provide a tool to show cache summary
[ https://issues.apache.org/jira/browse/HBASE-18156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allan Yang updated HBASE-18156: --- Attachment: HBASE-18156.patch
> Provide a tool to show cache summary
>
> Key: HBASE-18156
> URL: https://issues.apache.org/jira/browse/HBASE-18156
> Project: HBase
> Issue Type: New Feature
> Affects Versions: 2.0.0, 1.4.0
> Reporter: Allan Yang
> Assignee: Allan Yang
> Attachments: HBASE-18156.patch
>
> HBASE-17757 is already committed. But since there is no easy way to show the
> size distribution of cached blocks, it is hard to decide which unified size
> should be used.
> Here I provide a tool to show the details of the size distribution of cached
> blocks. This tool is widely used in our production environment. It is a JSP
> page that summarizes the cache details like this:
> {code}
> BlockCache type:org.apache.hadoop.hbase.io.hfile.LruBlockCache
> LruBlockCache
> Total size:28.40 GB
> Current size:22.49 GB
> MetaBlock size:1.56 GB
> Free size:5.91 GB
> Block count:152684
> Size distribution summary:
> BlockCacheSizeDistributionSummary [0 B<=blocksize<4 KB, blocks=833, heapSize=1.19 MB]
> BlockCacheSizeDistributionSummary [4 KB<=blocksize<8 KB, blocks=65, heapSize=310.83 KB]
> BlockCacheSizeDistributionSummary [8 KB<=blocksize<12 KB, blocks=175, heapSize=1.46 MB]
> BlockCacheSizeDistributionSummary [12 KB<=blocksize<16 KB, blocks=18, heapSize=267.43 KB]
> BlockCacheSizeDistributionSummary [16 KB<=blocksize<20 KB, blocks=512, heapSize=8.30 MB]
> BlockCacheSizeDistributionSummary [20 KB<=blocksize<24 KB, blocks=22, heapSize=499.66 KB]
> BlockCacheSizeDistributionSummary [24 KB<=blocksize<28 KB, blocks=24, heapSize=632.59 KB]
> BlockCacheSizeDistributionSummary [28 KB<=blocksize<32 KB, blocks=34, heapSize=1.02 MB]
> BlockCacheSizeDistributionSummary [32 KB<=blocksize<36 KB, blocks=31, heapSize=1.02 MB]
> BlockCacheSizeDistributionSummary [36 KB<=blocksize<40 KB, blocks=22, heapSize=838.58 KB]
> BlockCacheSizeDistributionSummary [40 KB<=blocksize<44 KB, blocks=28, heapSize=1.15 MB]
> {code}
-- This message was sent by Atlassian JIRA (v6.3.15#6346)
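The size-distribution summary quoted above can be sketched as a simple bucketing pass over the cached blocks: group each block into a 4 KB-wide size range and accumulate a per-range count and heap size. This is an illustrative assumption about how such a summary could be computed; the class and method names are made up for the example, not the actual HBASE-18156 code.

```java
import java.util.Map;
import java.util.TreeMap;

// Illustrative sketch (not the actual HBASE-18156 implementation) of
// bucketing cached blocks into 4 KB-wide size ranges, tracking per-range
// block counts and accumulated heap size.
public class BlockSizeDistributionSketch {
  static final long BUCKET_WIDTH = 4 * 1024; // 4 KB per range, as in the summary above

  // bucket index -> {block count, total heap size in bytes}
  private final TreeMap<Long, long[]> buckets = new TreeMap<>();

  static long bucketIndex(long blockHeapSize) {
    return blockHeapSize / BUCKET_WIDTH;
  }

  public void record(long blockHeapSize) {
    long[] entry = buckets.computeIfAbsent(bucketIndex(blockHeapSize), k -> new long[2]);
    entry[0]++;                // one more block falls in this size range
    entry[1] += blockHeapSize; // accumulated heap size of the range
  }

  public String summary() {
    StringBuilder sb = new StringBuilder();
    for (Map.Entry<Long, long[]> e : buckets.entrySet()) {
      long lowKb = e.getKey() * BUCKET_WIDTH / 1024;
      long highKb = (e.getKey() + 1) * BUCKET_WIDTH / 1024;
      sb.append(String.format("[%d KB<=blocksize<%d KB, blocks=%d, heapSize=%d B]%n",
          lowKb, highKb, e.getValue()[0], e.getValue()[1]));
    }
    return sb.toString();
  }
}
```

A real cache summary would iterate the cache's block index the same way, one `record` per cached block, and print one line per non-empty range.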
[jira] [Created] (HBASE-18156) Provide a tool to show cache summary
Allan Yang created HBASE-18156: -- Summary: Provide a tool to show cache summary Key: HBASE-18156 URL: https://issues.apache.org/jira/browse/HBASE-18156 Project: HBase Issue Type: New Feature Affects Versions: 2.0.0, 1.4.0 Reporter: Allan Yang Assignee: Allan Yang -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (HBASE-18156) Provide a tool to show cache summary
[ https://issues.apache.org/jira/browse/HBASE-18156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allan Yang updated HBASE-18156: --- Status: Patch Available (was: Open) > Provide a tool to show cache summary -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (HBASE-18132) Low replication should be checked in period in case of datanode rolling upgrade
[ https://issues.apache.org/jira/browse/HBASE-18132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16036230#comment-16036230 ] Ted Yu commented on HBASE-18132: Lgtm
> Low replication should be checked in period in case of datanode rolling upgrade
>
> Key: HBASE-18132
> URL: https://issues.apache.org/jira/browse/HBASE-18132
> Project: HBase
> Issue Type: Bug
> Affects Versions: 1.4.0, 1.1.10
> Reporter: Allan Yang
> Assignee: Allan Yang
> Attachments: HBASE-18132-branch-1.patch, HBASE-18132-branch-1.v2.patch, HBASE-18132-branch-1.v3.patch, HBASE-18132-branch-1.v4.patch, HBASE-18132.patch
>
> For now, we only check the low replication of WALs when there is a sync operation
> (HBASE-2234), rolling the log if the replica count of the WAL is less than
> configured. But if the WAL has very few writes, or no writes at all, low
> replication will not be detected and thus no log will be rolled.
> That is a problem when rolling-upgrading datanodes: all replicas of a WAL with
> no writes will be restarted, leaving the WAL file in an abnormal state, and
> later attempts to open this file will always fail.
> I propose a patch that checks the low replication of WALs at a configured period.
> When rolling-upgrading datanodes, we just need to make sure the restart interval
> between two nodes is bigger than the low-replication check period; the WAL will
> then be closed and rolled normally. A UT in the patch demonstrates this.
-- This message was sent by Atlassian JIRA (v6.3.15#6346)
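The periodic check the issue proposes can be sketched as a scheduled task that compares the WAL's live replica count against the configured minimum and requests a roll when it drops, even if no writes (and therefore no syncs) are happening. This is a hedged sketch under assumed hooks; `currentReplicas` and `rollRequest` stand in for the real HBase APIs and are not the actual patch code.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.IntSupplier;

// Hedged sketch of the periodic low-replication check proposed in
// HBASE-18132; the hook names are illustrative assumptions.
public class LowReplicationCheckerSketch {
  // Pure decision: roll the WAL when the live replica count has dropped
  // below the configured minimum.
  static boolean needsRoll(int currentReplicas, int minReplicas) {
    return currentReplicas < minReplicas;
  }

  private final ScheduledExecutorService scheduler =
      Executors.newSingleThreadScheduledExecutor();

  // currentReplicas: supplies the WAL's live replica count (placeholder hook).
  // rollRequest: asks the log roller to close and roll the WAL (placeholder hook).
  public void start(IntSupplier currentReplicas, Runnable rollRequest,
                    int minReplicas, long checkPeriodMs) {
    scheduler.scheduleAtFixedRate(() -> {
      if (needsRoll(currentReplicas.getAsInt(), minReplicas)) {
        rollRequest.run(); // even an idle WAL gets rolled once replicas drop
      }
    }, checkPeriodMs, checkPeriodMs, TimeUnit.MILLISECONDS);
  }

  public void stop() {
    scheduler.shutdownNow();
  }
}
```

The rolling-upgrade guarantee then follows from timing: if the interval between restarting two datanodes exceeds `checkPeriodMs`, the check fires in between and the WAL is rolled before a second replica disappears.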
[jira] [Commented] (HBASE-17339) Scan-Memory-First Optimization for Get Operations
[ https://issues.apache.org/jira/browse/HBASE-17339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16036219#comment-16036219 ] Edward Bortnikov commented on HBASE-17339: -- Thanks [~eshcar]. Maybe it makes sense to describe the experiment we used to figure out the current implementation, to provide the community with the full picture (smile). We looked at a workload with temporal (rather than spatial) locality, namely writes closely followed by reads. This pattern is quite frequent in pub-sub scenarios. Instead of seeing a performance benefit from reading from the MemStore first, we saw a nearly 100% cache hit rate, and could not explain it for a while. The lazy evaluation procedure described by [~eshcar] sheds light on it. Explicitly prioritizing reads from the MemStore, rather than simply deferring the data fetch from disk, could still help avoid some accesses to Bloom filters made just to figure out whether the key has earlier versions on disk. The main practical impact is when the BF itself is not in memory, and accessing it triggers I/O. Is that a realistic scenario? We assume that normally BFs are permanently cached for all HFiles managed by the RS. Dear community - please speak up. Thanks.
> Scan-Memory-First Optimization for Get Operations
>
> Key: HBASE-17339
> URL: https://issues.apache.org/jira/browse/HBASE-17339
> Project: HBase
> Issue Type: Improvement
> Reporter: Eshcar Hillel
> Assignee: Eshcar Hillel
> Attachments: HBASE-17339-V01.patch, HBASE-17339-V02.patch, HBASE-17339-V03.patch, HBASE-17339-V03.patch, HBASE-17339-V04.patch, HBASE-17339-V05.patch, HBASE-17339-V06.patch, read-latency-mixed-workload.jpg
>
> The current implementation of a get operation (to retrieve values for a
> specific key) scans through all relevant stores of the region; for each store,
> both memory components (memstore segments) and disk components (HFiles) are
> scanned in parallel.
> We suggest applying an optimization that speculatively scans memory-only
> components first and, only if the result is incomplete, scans both memory and disk.
-- This message was sent by Atlassian JIRA (v6.3.15#6346)
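The speculative read path described above can be sketched as: try the memstore segments first, and fall back to the combined memory-plus-disk scan only when the in-memory result is incomplete. The `Store` interface below is an illustrative abstraction invented for the example, not HBase's actual store API.

```java
import java.util.Optional;

// Illustrative sketch of HBASE-17339's memory-first get; the Store
// abstraction is an assumption for the example, not HBase's API.
public class MemoryFirstGetSketch {
  interface Store {
    Optional<String> getFromMemory(String key);        // scan memstore segments only
    Optional<String> getFromMemoryAndDisk(String key); // full parallel memory+disk scan
    boolean isComplete(String key, String value);      // e.g. all requested versions found
  }

  static Optional<String> get(Store store, String key) {
    Optional<String> inMemory = store.getFromMemory(key);
    if (inMemory.isPresent() && store.isComplete(key, inMemory.get())) {
      // Memory alone answered the get: no disk scan, no Bloom-filter lookups.
      return inMemory;
    }
    // Incomplete (or absent) in memory: fall back to scanning both.
    return store.getFromMemoryAndDisk(key);
  }
}
```

The trade-off debated in the comment above lives in the first branch: when the memory result is complete, this path never touches the HFiles' Bloom filters, which only matters in practice if those filters are not already cached.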
[jira] [Commented] (HBASE-18132) Low replication should be checked in period in case of datanode rolling upgrade
[ https://issues.apache.org/jira/browse/HBASE-18132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16036212#comment-16036212 ] Allan Yang commented on HBASE-18132: The failed tests either passed locally or are unrelated, [~tedyu] > Low replication should be checked in period in case of datanode rolling upgrade -- This message was sent by Atlassian JIRA (v6.3.15#6346)