[jira] [Commented] (HBASE-8942) DFS errors during a read operation (get/scan), may cause write outliers

2013-11-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13813745#comment-13813745
 ] 

Hudson commented on HBASE-8942:
---

SUCCESS: Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #826 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/826/])
HBASE-8942 DFS errors during a read operation (get/scan), may cause write 
outliers (stack: rev 1538867)
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStore.java


 DFS errors during a read operation (get/scan), may cause write outliers
 ---

 Key: HBASE-8942
 URL: https://issues.apache.org/jira/browse/HBASE-8942
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.89-fb, 0.95.2
Reporter: Amitanand Aiyer
Assignee: Amitanand Aiyer
Priority: Minor
 Fix For: 0.89-fb, 0.98.0, 0.96.1, 0.94.14

 Attachments: 8942.094.txt, 8942.096.txt, HBase-8942.txt


 This is a similar issue to the one discussed in HBASE-8228.
 1) A scanner holds the Store.ReadLock() while opening the store files ... 
 encounters errors. Thus, it takes a long time to finish.
 2) A flush completes in the meantime. It needs the write lock to 
 commit() and update scanners, so it ends up waiting.
 3+) All Puts (and also Gets) to the CF, which need a read lock, 
 have to wait for 1) and 2) to complete. This blocks updates to the system 
 for the duration of the DFS timeout.
 Fix:
  Open store files outside the read lock. getScanners() already tries to do 
 this optimisation. However, Store.getScanner(), which calls this function 
 through the StoreScanner constructor, redundantly grabs the readLock, 
 causing the readLock to be held while the storeFiles are being opened and 
 seeked.
  We should get rid of the readLock() in Store.getScanner(). It is not 
 required: the constructor for StoreScanner calls getScanners(xxx, xxx, xxx), 
 which already has the required locking.
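
 The locking pattern described above can be illustrated with a minimal
 sketch. This is not the HStore code: StoreLockSketch, openAndSeek() and the
 other names are hypothetical stand-ins, and the store file list is just a
 list of strings. It only shows the difference between doing the slow open
 while the read lock is held and taking a snapshot under the lock, then
 opening the files after the lock is released.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Toy stand-in for a store; all names here are hypothetical, not HBase APIs. */
class StoreLockSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private List<String> storeFiles = new ArrayList<>(Arrays.asList("sf1", "sf2"));
    // Records whether the most recent openAndSeek() ran with the read lock held.
    private boolean lastOpenHeldReadLock;

    /** Stands in for the slow step: opening and seeking a store file on DFS. */
    private String openAndSeek(String file) {
        lastOpenHeldReadLock = lock.getReadHoldCount() > 0;
        return "scanner(" + file + ")";
    }

    /** Problem pattern: the slow open happens while the read lock is held. */
    List<String> getScannersHoldingLock() {
        lock.readLock().lock();
        try {
            List<String> scanners = new ArrayList<>();
            for (String f : storeFiles) {
                scanners.add(openAndSeek(f));
            }
            return scanners;
        } finally {
            lock.readLock().unlock();
        }
    }

    /** Intended pattern: snapshot the file list under the lock, open outside it. */
    List<String> getScannersOutsideLock() {
        List<String> snapshot;
        lock.readLock().lock();
        try {
            snapshot = new ArrayList<>(storeFiles);
        } finally {
            lock.readLock().unlock();
        }
        List<String> scanners = new ArrayList<>();
        for (String f : snapshot) {
            scanners.add(openAndSeek(f)); // no store lock held here
        }
        return scanners;
    }

    /** A flush commit needs the write lock, so it waits for every reader. */
    void commitFlush(String newFile) {
        lock.writeLock().lock();
        try {
            List<String> updated = new ArrayList<>(storeFiles);
            updated.add(newFile);
            storeFiles = updated;
        } finally {
            lock.writeLock().unlock();
        }
    }

    boolean lastOpenHeldReadLock() {
        return lastOpenHeldReadLock;
    }
}
```

 In the second variant a commitFlush() caller only has to wait for the short
 snapshot step, not for the file opens. The sketch does not address what
 happens if a file in the snapshot is removed before it is opened.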



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HBASE-8942) DFS errors during a read operation (get/scan), may cause write outliers

2013-11-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13813757#comment-13813757
 ] 

Hudson commented on HBASE-8942:
---

FAILURE: Integrated in hbase-0.96-hadoop2 #113 (See 
[https://builds.apache.org/job/hbase-0.96-hadoop2/113/])
HBASE-8942 DFS errors during a read operation (get/scan), may cause write 
outliers (stack: rev 1538868)
* 
/hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStore.java




[jira] [Commented] (HBASE-8942) DFS errors during a read operation (get/scan), may cause write outliers

2013-11-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13813843#comment-13813843
 ] 

Hudson commented on HBASE-8942:
---

SUCCESS: Integrated in hbase-0.96 #180 (See 
[https://builds.apache.org/job/hbase-0.96/180/])
HBASE-8942 DFS errors during a read operation (get/scan), may cause write 
outliers (stack: rev 1538868)
* 
/hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStore.java




[jira] [Commented] (HBASE-8942) DFS errors during a read operation (get/scan), may cause write outliers

2013-11-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13813852#comment-13813852
 ] 

Hudson commented on HBASE-8942:
---

SUCCESS: Integrated in HBase-TRUNK #4668 (See 
[https://builds.apache.org/job/HBase-TRUNK/4668/])
HBASE-8942 DFS errors during a read operation (get/scan), may cause write 
outliers (stack: rev 1538867)
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStore.java




[jira] [Commented] (HBASE-8942) DFS errors during a read operation (get/scan), may cause write outliers

2013-11-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13814207#comment-13814207
 ] 

Hudson commented on HBASE-8942:
---

SUCCESS: Integrated in HBase-0.94-security #329 (See 
[https://builds.apache.org/job/HBase-0.94-security/329/])
HBASE-8942 DFS errors during a read operation (get/scan), may cause write 
outliers; REVERT (stack: rev 1538869)
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java




[jira] [Commented] (HBASE-8942) DFS errors during a read operation (get/scan), may cause write outliers

2013-11-05 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13814230#comment-13814230
 ] 

Hudson commented on HBASE-8942:
---

FAILURE: Integrated in HBase-0.94 #1195 (See 
[https://builds.apache.org/job/HBase-0.94/1195/])
HBASE-8942 DFS errors during a read operation (get/scan), may cause write 
outliers; REVERT (stack: rev 1538869)
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java




[jira] [Commented] (HBASE-8942) DFS errors during a read operation (get/scan), may cause write outliers

2013-11-04 Thread Liang Xie (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13812669#comment-13812669
 ] 

Liang Xie commented on HBASE-8942:
--

It seems the diff makes the TestHRegion case a bit unstable.

for i in {1..10};do mvn clean test -P localTests -Dtest=TestHRegion#testParallelAppendWithMemStoreFlush > /tmp/${i}; done
All runs passed on my desktop.

But with
for i in {1..10};do mvn clean test -P localTests -Dtest=TestHRegion > /tmp/${i}; done
3 of 10 runs failed.



[jira] [Commented] (HBASE-8942) DFS errors during a read operation (get/scan), may cause write outliers

2013-11-04 Thread Liang Xie (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13812714#comment-13812714
 ] 

Liang Xie commented on HBASE-8942:
--

The testParallelAppendWithMemStoreFlush case was introduced by HBASE-6210.
Its failure probably means data will be lost.



[jira] [Commented] (HBASE-8942) DFS errors during a read operation (get/scan), may cause write outliers

2013-11-04 Thread Amitanand Aiyer (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13813119#comment-13813119
 ] 

Amitanand Aiyer commented on HBASE-8942:


Hey Liang, thanks for pointing the issue out. We will try to port the test and 
run through it.

Another issue that we have seen recently is that this diff exposes some DFS 
errors during RegionScanner creation, if a compaction deletes one of the files 
while the scanner is being created, before the scanner is registered.

https://issues.apache.org/jira/browse/HBASE-9889 should fix that issue.



[jira] [Commented] (HBASE-8942) DFS errors during a read operation (get/scan), may cause write outliers

2013-11-04 Thread Amitanand Aiyer (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13813157#comment-13813157
 ] 

Amitanand Aiyer commented on HBASE-8942:


Seems like the testcase is doing some append operations. This is not available 
on 0.89, so unable to port the test back.

Will probably just focus on the open source trunk failures.



[jira] [Commented] (HBASE-8942) DFS errors during a read operation (get/scan), may cause write outliers

2013-11-04 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13813330#comment-13813330
 ] 

Ted Yu commented on HBASE-8942:
---

From the following call stack (trunk), I don't see where readLock is grabbed:
{code}
HStore.getScanner(Scan, NavigableSet<byte[]>, long) line: 1683
HRegion$RegionScannerImpl.<init>(Scan, List<KeyValueScanner>, HRegion) line: 3427
HRegion.instantiateRegionScanner(Scan, List<KeyValueScanner>) line: 1746
HRegion.getScanner(Scan, List<KeyValueScanner>) line: 1738
HRegion.getScanner(Scan) line: 1715
TestHRegionBusyWait(TestHRegion).testWritesWhileScanning() line: 2914
{code}



[jira] [Commented] (HBASE-8942) DFS errors during a read operation (get/scan), may cause write outliers

2013-11-04 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13813506#comment-13813506
 ] 

Lars Hofhansl commented on HBASE-8942:
--

The Store's readlock would be acquired inside. I do not think this is the issue.

[~xieliang007], do you still see the issue with this patch reverted? From 
inspecting the code and the patch I do not see anything wrong with this.



[jira] [Commented] (HBASE-8942) DFS errors during a read operation (get/scan), may cause write outliers

2013-11-04 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13813516#comment-13813516
 ] 

Lars Hofhansl commented on HBASE-8942:
--

[~stack], FYI. Might have to revert.



[jira] [Commented] (HBASE-8942) DFS errors during a read operation (get/scan), may cause write outliers

2013-11-04 Thread Liang Xie (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13813542#comment-13813542
 ] 

Liang Xie commented on HBASE-8942:
--

[~lhofhansl], I could not repro the failure after reverting. I am sure the 
test failure was caused by this jira/diff. Let's revert it now, 
[~lhofhansl], [~saint@gmail.com].
I'll dig into it per [~amitanand]'s comment.



[jira] [Commented] (HBASE-8942) DFS errors during a read operation (get/scan), may cause write outliers

2013-11-04 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13813658#comment-13813658
 ] 

stack commented on HBASE-8942:
--

Reverted for now from trunk, 0.96, and 0.94.  Thanks for figuring out that this 
was the culprit, lads (though looking at it, it looks good to me, and a nice 
fix to have).



[jira] [Commented] (HBASE-8942) DFS errors during a read operation (get/scan), may cause write outliers

2013-11-04 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13813670#comment-13813670
 ] 

Lars Hofhansl commented on HBASE-8942:
--

Thanks Stack. You beat me to it :)



[jira] [Commented] (HBASE-8942) DFS errors during a read operation (get/scan), may cause write outliers

2013-11-04 Thread Liang Xie (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13813704#comment-13813704
 ] 

Liang Xie commented on HBASE-8942:
--

I reran this diff combined with the one from HBASE-9889 and can still 
reproduce the above failure.
I need to dig into the code in detail to find the root cause now :)

 DFS errors during a read operation (get/scan), may cause write outliers
 ---

 Key: HBASE-8942
 URL: https://issues.apache.org/jira/browse/HBASE-8942
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.89-fb, 0.95.2
Reporter: Amitanand Aiyer
Assignee: Amitanand Aiyer
Priority: Minor
 Fix For: 0.89-fb, 0.98.0, 0.96.1, 0.94.14

 Attachments: 8942.094.txt, 8942.096.txt, HBase-8942.txt




[jira] [Commented] (HBASE-8942) DFS errors during a read operation (get/scan), may cause write outliers

2013-11-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13812292#comment-13812292
 ] 

Hudson commented on HBASE-8942:
---

FAILURE: Integrated in hbase-0.96 #178 (See 
[https://builds.apache.org/job/hbase-0.96/178/])
HBASE-8942 DFS errors during a read operation (get/scan), may cause write 
outliers (stack: rev 1538318)
* 
/hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStore.java


 DFS errors during a read operation (get/scan), may cause write outliers
 ---

 Key: HBASE-8942
 URL: https://issues.apache.org/jira/browse/HBASE-8942
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.89-fb, 0.95.2
Reporter: Amitanand Aiyer
Assignee: Amitanand Aiyer
Priority: Minor
 Fix For: 0.89-fb, 0.98.0, 0.96.1

 Attachments: 8942.096.txt, HBase-8942.txt




[jira] [Commented] (HBASE-8942) DFS errors during a read operation (get/scan), may cause write outliers

2013-11-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13812319#comment-13812319
 ] 

Hudson commented on HBASE-8942:
---

SUCCESS: Integrated in HBase-TRUNK #4665 (See 
[https://builds.apache.org/job/HBase-TRUNK/4665/])
HBASE-8942 DFS errors during a read operation (get/scan), may cause write 
outliers (stack: rev 1538317)
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStore.java


 DFS errors during a read operation (get/scan), may cause write outliers
 ---

 Key: HBASE-8942
 URL: https://issues.apache.org/jira/browse/HBASE-8942
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.89-fb, 0.95.2
Reporter: Amitanand Aiyer
Assignee: Amitanand Aiyer
Priority: Minor
 Fix For: 0.89-fb, 0.98.0, 0.96.1

 Attachments: 8942.096.txt, HBase-8942.txt




[jira] [Commented] (HBASE-8942) DFS errors during a read operation (get/scan), may cause write outliers

2013-11-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13812340#comment-13812340
 ] 

Hudson commented on HBASE-8942:
---

FAILURE: Integrated in hbase-0.96-hadoop2 #112 (See 
[https://builds.apache.org/job/hbase-0.96-hadoop2/112/])
HBASE-8942 DFS errors during a read operation (get/scan), may cause write 
outliers (stack: rev 1538318)
* 
/hbase/branches/0.96/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStore.java


 DFS errors during a read operation (get/scan), may cause write outliers
 ---

 Key: HBASE-8942
 URL: https://issues.apache.org/jira/browse/HBASE-8942
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.89-fb, 0.95.2
Reporter: Amitanand Aiyer
Assignee: Amitanand Aiyer
Priority: Minor
 Fix For: 0.89-fb, 0.98.0, 0.96.1

 Attachments: 8942.096.txt, HBase-8942.txt




[jira] [Commented] (HBASE-8942) DFS errors during a read operation (get/scan), may cause write outliers

2013-11-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13812355#comment-13812355
 ] 

Hudson commented on HBASE-8942:
---

SUCCESS: Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #824 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/824/])
HBASE-8942 DFS errors during a read operation (get/scan), may cause write 
outliers (stack: rev 1538317)
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStore.java


 DFS errors during a read operation (get/scan), may cause write outliers
 ---

 Key: HBASE-8942
 URL: https://issues.apache.org/jira/browse/HBASE-8942
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.89-fb, 0.95.2
Reporter: Amitanand Aiyer
Assignee: Amitanand Aiyer
Priority: Minor
 Fix For: 0.89-fb, 0.98.0, 0.96.1

 Attachments: 8942.096.txt, HBase-8942.txt




[jira] [Commented] (HBASE-8942) DFS errors during a read operation (get/scan), may cause write outliers

2013-11-03 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13812556#comment-13812556
 ] 

Lars Hofhansl commented on HBASE-8942:
--

Checked the 0.94 code. Should be safe there as well. Good find.

 DFS errors during a read operation (get/scan), may cause write outliers
 ---

 Key: HBASE-8942
 URL: https://issues.apache.org/jira/browse/HBASE-8942
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.89-fb, 0.95.2
Reporter: Amitanand Aiyer
Assignee: Amitanand Aiyer
Priority: Minor
 Fix For: 0.89-fb, 0.98.0, 0.96.1

 Attachments: 8942.096.txt, HBase-8942.txt




[jira] [Commented] (HBASE-8942) DFS errors during a read operation (get/scan), may cause write outliers

2013-11-03 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13812557#comment-13812557
 ] 

Lars Hofhansl commented on HBASE-8942:
--

Committed to 0.94 as well.

 DFS errors during a read operation (get/scan), may cause write outliers
 ---

 Key: HBASE-8942
 URL: https://issues.apache.org/jira/browse/HBASE-8942
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.89-fb, 0.95.2
Reporter: Amitanand Aiyer
Assignee: Amitanand Aiyer
Priority: Minor
 Fix For: 0.89-fb, 0.98.0, 0.96.1, 0.94.14

 Attachments: 8942.096.txt, HBase-8942.txt




[jira] [Commented] (HBASE-8942) DFS errors during a read operation (get/scan), may cause write outliers

2013-11-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13812610#comment-13812610
 ] 

Hudson commented on HBASE-8942:
---

FAILURE: Integrated in HBase-0.94-security #328 (See 
[https://builds.apache.org/job/HBase-0.94-security/328/])
HBASE-8942 DFS errors during a read operation (get/scan), may cause write 
outliers (Amitanand Aiyer) (larsh: rev 1538484)
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java


 DFS errors during a read operation (get/scan), may cause write outliers
 ---

 Key: HBASE-8942
 URL: https://issues.apache.org/jira/browse/HBASE-8942
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.89-fb, 0.95.2
Reporter: Amitanand Aiyer
Assignee: Amitanand Aiyer
Priority: Minor
 Fix For: 0.89-fb, 0.98.0, 0.96.1, 0.94.14

 Attachments: 8942.096.txt, HBase-8942.txt




[jira] [Commented] (HBASE-8942) DFS errors during a read operation (get/scan), may cause write outliers

2013-11-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13812621#comment-13812621
 ] 

Hudson commented on HBASE-8942:
---

SUCCESS: Integrated in HBase-0.94 #1194 (See 
[https://builds.apache.org/job/HBase-0.94/1194/])
HBASE-8942 DFS errors during a read operation (get/scan), may cause write 
outliers (Amitanand Aiyer) (larsh: rev 1538484)
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java


 DFS errors during a read operation (get/scan), may cause write outliers
 ---

 Key: HBASE-8942
 URL: https://issues.apache.org/jira/browse/HBASE-8942
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.89-fb, 0.95.2
Reporter: Amitanand Aiyer
Assignee: Amitanand Aiyer
Priority: Minor
 Fix For: 0.89-fb, 0.98.0, 0.96.1, 0.94.14

 Attachments: 8942.096.txt, HBase-8942.txt




[jira] [Commented] (HBASE-8942) DFS errors during a read operation (get/scan), may cause write outliers

2013-11-01 Thread Liang Xie (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13811131#comment-13811131
 ] 

Liang Xie commented on HBASE-8942:
--

thanks [~amitanand]

 DFS errors during a read operation (get/scan), may cause write outliers
 ---

 Key: HBASE-8942
 URL: https://issues.apache.org/jira/browse/HBASE-8942
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.89-fb, 0.95.2
Reporter: Amitanand Aiyer
Assignee: Amitanand Aiyer
Priority: Minor
 Fix For: 0.89-fb

 Attachments: HBase-8942.txt




[jira] [Commented] (HBASE-8942) DFS errors during a read operation (get/scan), may cause write outliers

2013-11-01 Thread Liang Xie (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-8942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13811132#comment-13811132
 ] 

Liang Xie commented on HBASE-8942:
--

[~lhofhansl], would you like to bring it into the 0.94 branch as well? It seems 
a low-risk improvement :)

 DFS errors during a read operation (get/scan), may cause write outliers
 ---

 Key: HBASE-8942
 URL: https://issues.apache.org/jira/browse/HBASE-8942
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.89-fb, 0.95.2
Reporter: Amitanand Aiyer
Assignee: Amitanand Aiyer
Priority: Minor
 Fix For: 0.89-fb

 Attachments: HBase-8942.txt

