[jira] [Commented] (HBASE-3417) CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme

2011-10-16 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13128386#comment-13128386
 ] 

Hudson commented on HBASE-3417:
---

Integrated in HBase-TRUNK #2325 (See 
[https://builds.apache.org/job/HBase-TRUNK/2325/])
HBASE-3417  CacheOnWrite is using the temporary output path for block 
names, need to use a more consistent block naming scheme (jgray)

jgray : 
Files : 
* /hbase/trunk/CHANGES.txt
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java


 CacheOnWrite is using the temporary output path for block names, need to use 
 a more consistent block naming scheme
 --

 Key: HBASE-3417
 URL: https://issues.apache.org/jira/browse/HBASE-3417
 Project: HBase
  Issue Type: Bug
  Components: io, regionserver
Affects Versions: 0.92.0
Reporter: Jonathan Gray
Assignee: Jonathan Gray
Priority: Critical
 Fix For: 0.92.0

 Attachments: HBASE-3417-redux-v1.patch, HBASE-3417-v1.patch, 
 HBASE-3417-v2.patch, HBASE-3417-v5.patch


 Currently the block names used in the block cache are built using the 
 filesystem path.  However, for cache on write, the path is a temporary output 
 file.
 The original COW patch actually made some modifications to block naming stuff 
 to make it more consistent but did not do enough.  Should add a separate 
 method somewhere for generating block names using some more easily mocked 
 scheme (rather than just raw path as we generate a random unique file name 
 twice, once for tmp and then again when moved into place).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-3417) CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme

2011-10-13 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13126378#comment-13126378
 ] 

jirapos...@reviews.apache.org commented on HBASE-3417:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2379/
---

(Updated 2011-10-13 06:31:13.710051)


Review request for hbase, Dhruba Borthakur, Michael Stack, and Mikhail Bautin.


Changes
---

Applies cleanly.


Summary
---

Adds a new test that fails w/o this fix and changes the naming scheme for 
storefiles to use UUID instead of random longs as ascii.

The big change is that the name of the tmp file used when flushing and 
compacting is the same name (but different dir) when moved in place.  This 
makes it so block names are consistent for COW.


This addresses bug HBASE-3417.
https://issues.apache.org/jira/browse/HBASE-3417


Diffs (updated)
-

  /src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java 
1182679 
  /src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java 1182679 
  /src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1182679 
  /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1182679 
  /src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 1182679 

Diff: https://reviews.apache.org/r/2379/diff


Testing
---

TestFromClientSide and TestCacheOnWrite both working


Thanks,

Jonathan



 CacheOnWrite is using the temporary output path for block names, need to use 
 a more consistent block naming scheme
 --

 Key: HBASE-3417
 URL: https://issues.apache.org/jira/browse/HBASE-3417
 Project: HBase
  Issue Type: Bug
  Components: io, regionserver
Affects Versions: 0.92.0
Reporter: Jonathan Gray
Assignee: Jonathan Gray
Priority: Critical
 Fix For: 0.92.0

 Attachments: HBASE-3417-redux-v1.patch, HBASE-3417-v1.patch, 
 HBASE-3417-v2.patch, HBASE-3417-v5.patch


 Currently the block names used in the block cache are built using the 
 filesystem path.  However, for cache on write, the path is a temporary output 
 file.
 The original COW patch actually made some modifications to block naming stuff 
 to make it more consistent but did not do enough.  Should add a separate 
 method somewhere for generating block names using some more easily mocked 
 scheme (rather than just raw path as we generate a random unique file name 
 twice, once for tmp and then again when moved into place).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-3417) CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme

2011-10-13 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13126639#comment-13126639
 ] 

jirapos...@reviews.apache.org commented on HBASE-3417:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2379/#review2548
---

Ship it!


Just a few questions in the below.  I'm good w/ commit.  Would suggest open new 
issue rather than reopen an old one in future -- makes it easier on the fellas 
coming up behind us trying to make some sense of what we did.  Good stuff.


/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
https://reviews.apache.org/r/2379/#comment5719

This is the tmp name?  Thats what you want?   I don't see where it gets set 
to renamed filename.



/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
https://reviews.apache.org/r/2379/#comment5720

Why?  To be more specific?

Oh, I see how about not removing the '-' from uuid?  Is it illegal in 
filesystem?



/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
https://reviews.apache.org/r/2379/#comment5721

This is better.


- Michael


On 2011-10-13 06:31:13, Jonathan Gray wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/2379/
bq.  ---
bq.  
bq.  (Updated 2011-10-13 06:31:13)
bq.  
bq.  
bq.  Review request for hbase, Dhruba Borthakur, Michael Stack, and Mikhail 
Bautin.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  Adds a new test that fails w/o this fix and changes the naming scheme for 
storefiles to use UUID instead of random longs as ascii.
bq.  
bq.  The big change is that the name of the tmp file used when flushing and 
compacting is the same name (but different dir) when moved in place.  This 
makes it so block names are consistent for COW.
bq.  
bq.  
bq.  This addresses bug HBASE-3417.
bq.  https://issues.apache.org/jira/browse/HBASE-3417
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq./src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java 
1182679 
bq./src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java 1182679 
bq./src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1182679 
bq./src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 
1182679 
bq./src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 
1182679 
bq.  
bq.  Diff: https://reviews.apache.org/r/2379/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  TestFromClientSide and TestCacheOnWrite both working
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Jonathan
bq.  
bq.



 CacheOnWrite is using the temporary output path for block names, need to use 
 a more consistent block naming scheme
 --

 Key: HBASE-3417
 URL: https://issues.apache.org/jira/browse/HBASE-3417
 Project: HBase
  Issue Type: Bug
  Components: io, regionserver
Affects Versions: 0.92.0
Reporter: Jonathan Gray
Assignee: Jonathan Gray
Priority: Critical
 Fix For: 0.92.0

 Attachments: HBASE-3417-redux-v1.patch, HBASE-3417-v1.patch, 
 HBASE-3417-v2.patch, HBASE-3417-v5.patch


 Currently the block names used in the block cache are built using the 
 filesystem path.  However, for cache on write, the path is a temporary output 
 file.
 The original COW patch actually made some modifications to block naming stuff 
 to make it more consistent but did not do enough.  Should add a separate 
 method somewhere for generating block names using some more easily mocked 
 scheme (rather than just raw path as we generate a random unique file name 
 twice, once for tmp and then again when moved into place).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-3417) CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme

2011-10-13 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13126764#comment-13126764
 ] 

jirapos...@reviews.apache.org commented on HBASE-3417:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2379/#review2555
---



/src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java
https://reviews.apache.org/r/2379/#comment5731

Can we keep these final and CacheConfig immutable, and create a separate 
instance of CacheConfig during testing instead?



/src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java
https://reviews.apache.org/r/2379/#comment5732

The same as above.



/src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java
https://reviews.apache.org/r/2379/#comment5736

Using the approach above we can avoid creating new setters.

If it's not feasible, then appending ForTesting to these method names 
will probably be enough to discourage use of them in production.



/src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java
https://reviews.apache.org/r/2379/#comment5733

Maybe we can avoid these setters using the approach I described above? If 
not, it should either be package-private, or the name should include something 
like ForTesting.



/src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java
https://reviews.apache.org/r/2379/#comment5734

The same as above.



/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
https://reviews.apache.org/r/2379/#comment5737

This does not return null anymore in the new version? This will probably 
result in an IOException down the line, is that handled well? We could throw an 
IOException right here as an alternative.



/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
https://reviews.apache.org/r/2379/#comment5738

@Michael: This pattern does not allow dashes in the UUID, it only allows 
0-9 and a-f.



/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
https://reviews.apache.org/r/2379/#comment5739

suffix.length() = 0 is unnecessary.



/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java
https://reviews.apache.org/r/2379/#comment5740

Nice test! This goes all the way from client side to HFile, unlike 
TestCacheOnWrite.


- Mikhail


On 2011-10-13 06:31:13, Jonathan Gray wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/2379/
bq.  ---
bq.  
bq.  (Updated 2011-10-13 06:31:13)
bq.  
bq.  
bq.  Review request for hbase, Dhruba Borthakur, Michael Stack, and Mikhail 
Bautin.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  Adds a new test that fails w/o this fix and changes the naming scheme for 
storefiles to use UUID instead of random longs as ascii.
bq.  
bq.  The big change is that the name of the tmp file used when flushing and 
compacting is the same name (but different dir) when moved in place.  This 
makes it so block names are consistent for COW.
bq.  
bq.  
bq.  This addresses bug HBASE-3417.
bq.  https://issues.apache.org/jira/browse/HBASE-3417
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq./src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java 
1182679 
bq./src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java 1182679 
bq./src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1182679 
bq./src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 
1182679 
bq./src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 
1182679 
bq.  
bq.  Diff: https://reviews.apache.org/r/2379/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  TestFromClientSide and TestCacheOnWrite both working
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Jonathan
bq.  
bq.



 CacheOnWrite is using the temporary output path for block names, need to use 
 a more consistent block naming scheme
 --

 Key: HBASE-3417
 URL: https://issues.apache.org/jira/browse/HBASE-3417
 Project: HBase
  Issue Type: Bug
  Components: io, regionserver
Affects Versions: 0.92.0
Reporter: Jonathan Gray
Assignee: Jonathan Gray
Priority: Critical
 Fix For: 0.92.0

 Attachments: HBASE-3417-redux-v1.patch, HBASE-3417-v1.patch, 
 HBASE-3417-v2.patch, HBASE-3417-v5.patch


 Currently the block names used in the block cache are built using the 
 filesystem path.  However, for cache on write, 

[jira] [Commented] (HBASE-3417) CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme

2011-10-13 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13126766#comment-13126766
 ] 

jirapos...@reviews.apache.org commented on HBASE-3417:
--



bq.  On 2011-10-13 17:58:45, Mikhail Bautin wrote:
bq.  

Nice fix, Jonathan! For some reason I assumed that HBASE-3417 was part of the 
HFile v2 patch when I submitted the patch, but apparently it was not.


- Mikhail


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2379/#review2555
---


On 2011-10-13 06:31:13, Jonathan Gray wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/2379/
bq.  ---
bq.  
bq.  (Updated 2011-10-13 06:31:13)
bq.  
bq.  
bq.  Review request for hbase, Dhruba Borthakur, Michael Stack, and Mikhail 
Bautin.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  Adds a new test that fails w/o this fix and changes the naming scheme for 
storefiles to use UUID instead of random longs as ascii.
bq.  
bq.  The big change is that the name of the tmp file used when flushing and 
compacting is the same name (but different dir) when moved in place.  This 
makes it so block names are consistent for COW.
bq.  
bq.  
bq.  This addresses bug HBASE-3417.
bq.  https://issues.apache.org/jira/browse/HBASE-3417
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq./src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java 
1182679 
bq./src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java 1182679 
bq./src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1182679 
bq./src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 
1182679 
bq./src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 
1182679 
bq.  
bq.  Diff: https://reviews.apache.org/r/2379/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  TestFromClientSide and TestCacheOnWrite both working
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Jonathan
bq.  
bq.



 CacheOnWrite is using the temporary output path for block names, need to use 
 a more consistent block naming scheme
 --

 Key: HBASE-3417
 URL: https://issues.apache.org/jira/browse/HBASE-3417
 Project: HBase
  Issue Type: Bug
  Components: io, regionserver
Affects Versions: 0.92.0
Reporter: Jonathan Gray
Assignee: Jonathan Gray
Priority: Critical
 Fix For: 0.92.0

 Attachments: HBASE-3417-redux-v1.patch, HBASE-3417-v1.patch, 
 HBASE-3417-v2.patch, HBASE-3417-v5.patch


 Currently the block names used in the block cache are built using the 
 filesystem path.  However, for cache on write, the path is a temporary output 
 file.
 The original COW patch actually made some modifications to block naming stuff 
 to make it more consistent but did not do enough.  Should add a separate 
 method somewhere for generating block names using some more easily mocked 
 scheme (rather than just raw path as we generate a random unique file name 
 twice, once for tmp and then again when moved into place).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-3417) CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme

2011-10-13 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13126832#comment-13126832
 ] 

jirapos...@reviews.apache.org commented on HBASE-3417:
--



bq.  On 2011-10-13 14:49:35, Michael Stack wrote:
bq.   /src/main/java/org/apache/hadoop/hbase/regionserver/Store.java, line 496
bq.   https://reviews.apache.org/r/2379/diff/2/?file=49944#file49944line496
bq.  
bq.   This is the tmp name?  Thats what you want?   I don't see where it 
gets set to renamed filename.

The tmp name and perm name are the same.  The tmp is just in .tmp


bq.  On 2011-10-13 14:49:35, Michael Stack wrote:
bq.   /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java, line 
166
bq.   https://reviews.apache.org/r/2379/diff/2/?file=49945#file49945line166
bq.  
bq.   Why?  To be more specific?
bq.   
bq.   Oh, I see how about not removing the '-' from uuid?  Is it 
illegal in filesystem?

I am removing the '-'.  This checks for alphanumerics only.


- Jonathan


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2379/#review2548
---


On 2011-10-13 06:31:13, Jonathan Gray wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/2379/
bq.  ---
bq.  
bq.  (Updated 2011-10-13 06:31:13)
bq.  
bq.  
bq.  Review request for hbase, Dhruba Borthakur, Michael Stack, and Mikhail 
Bautin.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  Adds a new test that fails w/o this fix and changes the naming scheme for 
storefiles to use UUID instead of random longs as ascii.
bq.  
bq.  The big change is that the name of the tmp file used when flushing and 
compacting is the same name (but different dir) when moved in place.  This 
makes it so block names are consistent for COW.
bq.  
bq.  
bq.  This addresses bug HBASE-3417.
bq.  https://issues.apache.org/jira/browse/HBASE-3417
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq./src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java 
1182679 
bq./src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java 1182679 
bq./src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1182679 
bq./src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 
1182679 
bq./src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 
1182679 
bq.  
bq.  Diff: https://reviews.apache.org/r/2379/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  TestFromClientSide and TestCacheOnWrite both working
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Jonathan
bq.  
bq.



 CacheOnWrite is using the temporary output path for block names, need to use 
 a more consistent block naming scheme
 --

 Key: HBASE-3417
 URL: https://issues.apache.org/jira/browse/HBASE-3417
 Project: HBase
  Issue Type: Bug
  Components: io, regionserver
Affects Versions: 0.92.0
Reporter: Jonathan Gray
Assignee: Jonathan Gray
Priority: Critical
 Fix For: 0.92.0

 Attachments: HBASE-3417-redux-v1.patch, HBASE-3417-v1.patch, 
 HBASE-3417-v2.patch, HBASE-3417-v5.patch


 Currently the block names used in the block cache are built using the 
 filesystem path.  However, for cache on write, the path is a temporary output 
 file.
 The original COW patch actually made some modifications to block naming stuff 
 to make it more consistent but did not do enough.  Should add a separate 
 method somewhere for generating block names using some more easily mocked 
 scheme (rather than just raw path as we generate a random unique file name 
 twice, once for tmp and then again when moved into place).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-3417) CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme

2011-10-13 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13126834#comment-13126834
 ] 

jirapos...@reviews.apache.org commented on HBASE-3417:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2379/#review2561
---


Thanks for the reviews stack and mikhail!


/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
https://reviews.apache.org/r/2379/#comment5744

I think i'll restore the old behavior of returning null, but if it gets an 
IOE, that will be bubbled up (that should happen for a harder failure than just 
a simple rename failure, in which case, we probably want to bubble it up?)


- Jonathan


On 2011-10-13 06:31:13, Jonathan Gray wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/2379/
bq.  ---
bq.  
bq.  (Updated 2011-10-13 06:31:13)
bq.  
bq.  
bq.  Review request for hbase, Dhruba Borthakur, Michael Stack, and Mikhail 
Bautin.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  Adds a new test that fails w/o this fix and changes the naming scheme for 
storefiles to use UUID instead of random longs as ascii.
bq.  
bq.  The big change is that the name of the tmp file used when flushing and 
compacting is the same name (but different dir) when moved in place.  This 
makes it so block names are consistent for COW.
bq.  
bq.  
bq.  This addresses bug HBASE-3417.
bq.  https://issues.apache.org/jira/browse/HBASE-3417
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq./src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java 
1182679 
bq./src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java 1182679 
bq./src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1182679 
bq./src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 
1182679 
bq./src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 
1182679 
bq.  
bq.  Diff: https://reviews.apache.org/r/2379/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  TestFromClientSide and TestCacheOnWrite both working
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Jonathan
bq.  
bq.



 CacheOnWrite is using the temporary output path for block names, need to use 
 a more consistent block naming scheme
 --

 Key: HBASE-3417
 URL: https://issues.apache.org/jira/browse/HBASE-3417
 Project: HBase
  Issue Type: Bug
  Components: io, regionserver
Affects Versions: 0.92.0
Reporter: Jonathan Gray
Assignee: Jonathan Gray
Priority: Critical
 Fix For: 0.92.0

 Attachments: HBASE-3417-redux-v1.patch, HBASE-3417-v1.patch, 
 HBASE-3417-v2.patch, HBASE-3417-v5.patch


 Currently the block names used in the block cache are built using the 
 filesystem path.  However, for cache on write, the path is a temporary output 
 file.
 The original COW patch actually made some modifications to block naming stuff 
 to make it more consistent but did not do enough.  Should add a separate 
 method somewhere for generating block names using some more easily mocked 
 scheme (rather than just raw path as we generate a random unique file name 
 twice, once for tmp and then again when moved into place).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-3417) CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme

2011-10-13 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13126838#comment-13126838
 ] 

jirapos...@reviews.apache.org commented on HBASE-3417:
--



bq.  On 2011-10-13 19:24:46, Jonathan Gray wrote:
bq.   /src/main/java/org/apache/hadoop/hbase/regionserver/Store.java, lines 
1228-1229
bq.   https://reviews.apache.org/r/2379/diff/2/?file=49944#file49944line1228
bq.  
bq.   I think i'll restore the old behavior of returning null, but if it 
gets an IOE, that will be bubbled up (that should happen for a harder failure 
than just a simple rename failure, in which case, we probably want to bubble it 
up?)

Sounds good. By the way, is a null return value handled correctly upstream? The 
whole compaction probably needs to be considered failed in that case.


- Mikhail


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2379/#review2561
---


On 2011-10-13 06:31:13, Jonathan Gray wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/2379/
bq.  ---
bq.  
bq.  (Updated 2011-10-13 06:31:13)
bq.  
bq.  
bq.  Review request for hbase, Dhruba Borthakur, Michael Stack, and Mikhail 
Bautin.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  Adds a new test that fails w/o this fix and changes the naming scheme for 
storefiles to use UUID instead of random longs as ascii.
bq.  
bq.  The big change is that the name of the tmp file used when flushing and 
compacting is the same name (but different dir) when moved in place.  This 
makes it so block names are consistent for COW.
bq.  
bq.  
bq.  This addresses bug HBASE-3417.
bq.  https://issues.apache.org/jira/browse/HBASE-3417
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq./src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java 
1182679 
bq./src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java 1182679 
bq./src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1182679 
bq./src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 
1182679 
bq./src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 
1182679 
bq.  
bq.  Diff: https://reviews.apache.org/r/2379/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  TestFromClientSide and TestCacheOnWrite both working
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Jonathan
bq.  
bq.



 CacheOnWrite is using the temporary output path for block names, need to use 
 a more consistent block naming scheme
 --

 Key: HBASE-3417
 URL: https://issues.apache.org/jira/browse/HBASE-3417
 Project: HBase
  Issue Type: Bug
  Components: io, regionserver
Affects Versions: 0.92.0
Reporter: Jonathan Gray
Assignee: Jonathan Gray
Priority: Critical
 Fix For: 0.92.0

 Attachments: HBASE-3417-redux-v1.patch, HBASE-3417-v1.patch, 
 HBASE-3417-v2.patch, HBASE-3417-v5.patch


 Currently the block names used in the block cache are built using the 
 filesystem path.  However, for cache on write, the path is a temporary output 
 file.
 The original COW patch actually made some modifications to block naming stuff 
 to make it more consistent but did not do enough.  Should add a separate 
 method somewhere for generating block names using some more easily mocked 
 scheme (rather than just raw path as we generate a random unique file name 
 twice, once for tmp and then again when moved into place).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-3417) CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme

2011-10-13 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13126839#comment-13126839
 ] 

jirapos...@reviews.apache.org commented on HBASE-3417:
--



bq.  On 2011-10-13 19:24:46, Jonathan Gray wrote:
bq.   /src/main/java/org/apache/hadoop/hbase/regionserver/Store.java, lines 
1228-1229
bq.   https://reviews.apache.org/r/2379/diff/2/?file=49944#file49944line1228
bq.  
bq.   I think i'll restore the old behavior of returning null, but if it 
gets an IOE, that will be bubbled up (that should happen for a harder failure 
than just a simple rename failure, in which case, we probably want to bubble it 
up?)
bq.  
bq.  Mikhail Bautin wrote:
bq.  Sounds good. By the way, is a null return value handled correctly 
upstream? The whole compaction probably needs to be considered failed in that 
case.

Guess I should have looked at that.  Seems that it is not handled.

I'm going to change the behavior to throw an IOException rather than return 
null.

I will kick off the test suite before committing.


- Jonathan


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2379/#review2561
---


On 2011-10-13 06:31:13, Jonathan Gray wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/2379/
bq.  ---
bq.  
bq.  (Updated 2011-10-13 06:31:13)
bq.  
bq.  
bq.  Review request for hbase, Dhruba Borthakur, Michael Stack, and Mikhail 
Bautin.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  Adds a new test that fails w/o this fix and changes the naming scheme for 
storefiles to use UUID instead of random longs as ascii.
bq.  
bq.  The big change is that the name of the tmp file used when flushing and 
compacting is the same name (but different dir) when moved in place.  This 
makes it so block names are consistent for COW.
bq.  
bq.  
bq.  This addresses bug HBASE-3417.
bq.  https://issues.apache.org/jira/browse/HBASE-3417
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq./src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java 
1182679 
bq./src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java 1182679 
bq./src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1182679 
bq./src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 
1182679 
bq./src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 
1182679 
bq.  
bq.  Diff: https://reviews.apache.org/r/2379/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  TestFromClientSide and TestCacheOnWrite both working
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Jonathan
bq.  
bq.



 CacheOnWrite is using the temporary output path for block names, need to use 
 a more consistent block naming scheme
 --

 Key: HBASE-3417
 URL: https://issues.apache.org/jira/browse/HBASE-3417
 Project: HBase
  Issue Type: Bug
  Components: io, regionserver
Affects Versions: 0.92.0
Reporter: Jonathan Gray
Assignee: Jonathan Gray
Priority: Critical
 Fix For: 0.92.0

 Attachments: HBASE-3417-redux-v1.patch, HBASE-3417-v1.patch, 
 HBASE-3417-v2.patch, HBASE-3417-v5.patch


 Currently the block names used in the block cache are built using the 
 filesystem path.  However, for cache on write, the path is a temporary output 
 file.
 The original COW patch actually made some modifications to block naming stuff 
 to make it more consistent but did not do enough.  Should add a separate 
 method somewhere for generating block names using some more easily mocked 
 scheme (rather than just raw path as we generate a random unique file name 
 twice, once for tmp and then again when moved into place).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-3417) CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme

2011-10-13 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13126842#comment-13126842
 ] 

jirapos...@reviews.apache.org commented on HBASE-3417:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2379/#review2564
---



/src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java
https://reviews.apache.org/r/2379/#comment5748

I like this suggestion... keeping config immutable and go make a new one if 
you need to change something... w/ all final.


- Michael


On 2011-10-13 06:31:13, Jonathan Gray wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/2379/
bq.  ---
bq.  
bq.  (Updated 2011-10-13 06:31:13)
bq.  
bq.  
bq.  Review request for hbase, Dhruba Borthakur, Michael Stack, and Mikhail 
Bautin.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  Adds a new test that fails w/o this fix and changes the naming scheme for 
storefiles to use UUID instead of random longs as ascii.
bq.  
bq.  The big change is that the name of the tmp file used when flushing and 
compacting is the same name (but different dir) when moved in place.  This 
makes it so block names are consistent for COW.
bq.  
bq.  
bq.  This addresses bug HBASE-3417.
bq.  https://issues.apache.org/jira/browse/HBASE-3417
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq./src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java 
1182679 
bq./src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java 1182679 
bq./src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1182679 
bq./src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 
1182679 
bq./src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 
1182679 
bq.  
bq.  Diff: https://reviews.apache.org/r/2379/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  TestFromClientSide and TestCacheOnWrite both working
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Jonathan
bq.  
bq.



 CacheOnWrite is using the temporary output path for block names, need to use 
 a more consistent block naming scheme
 --

 Key: HBASE-3417
 URL: https://issues.apache.org/jira/browse/HBASE-3417
 Project: HBase
  Issue Type: Bug
  Components: io, regionserver
Affects Versions: 0.92.0
Reporter: Jonathan Gray
Assignee: Jonathan Gray
Priority: Critical
 Fix For: 0.92.0

 Attachments: HBASE-3417-redux-v1.patch, HBASE-3417-v1.patch, 
 HBASE-3417-v2.patch, HBASE-3417-v5.patch


 Currently the block names used in the block cache are built using the 
 filesystem path.  However, for cache on write, the path is a temporary output 
 file.
 The original COW patch actually made some modifications to block naming stuff 
 to make it more consistent but did not do enough.  Should add a separate 
 method somewhere for generating block names using some more easily mocked 
 scheme (rather than just raw path as we generate a random unique file name 
 twice, once for tmp and then again when moved into place).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-3417) CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme

2011-10-13 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13126841#comment-13126841
 ] 

jirapos...@reviews.apache.org commented on HBASE-3417:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2379/
---

(Updated 2011-10-13 19:33:02.273799)


Review request for hbase, Dhruba Borthakur, Michael Stack, and Mikhail Bautin.


Changes
---

Changes from Mikhail review.  Changes it so if we fail rename we throw an 
exception, otherwise it looks like we would eventually get an NPE down the 
line?  There's no special casing for a null return.

Will run test suite before committing.


Summary
---

Adds a new test that fails w/o this fix and changes the naming scheme for 
storefiles to use UUID instead of random longs as ascii.

The big change is that the name of the tmp file used when flushing and 
compacting is the same name (but different dir) when moved in place.  This 
makes it so block names are consistent for COW.


This addresses bug HBASE-3417.
https://issues.apache.org/jira/browse/HBASE-3417


Diffs (updated)
-

  /src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java 
1183046 
  /src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java 1183046 
  /src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1183046 
  /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1183046 
  /src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 1183046 

Diff: https://reviews.apache.org/r/2379/diff


Testing
---

TestFromClientSide and TestCacheOnWrite both working


Thanks,

Jonathan



 CacheOnWrite is using the temporary output path for block names, need to use 
 a more consistent block naming scheme
 --

 Key: HBASE-3417
 URL: https://issues.apache.org/jira/browse/HBASE-3417
 Project: HBase
  Issue Type: Bug
  Components: io, regionserver
Affects Versions: 0.92.0
Reporter: Jonathan Gray
Assignee: Jonathan Gray
Priority: Critical
 Fix For: 0.92.0

 Attachments: HBASE-3417-redux-v1.patch, HBASE-3417-v1.patch, 
 HBASE-3417-v2.patch, HBASE-3417-v5.patch


 Currently the block names used in the block cache are built using the 
 filesystem path.  However, for cache on write, the path is a temporary output 
 file.
 The original COW patch actually made some modifications to block naming stuff 
 to make it more consistent but did not do enough.  Should add a separate 
 method somewhere for generating block names using some more easily mocked 
 scheme (rather than just raw path as we generate a random unique file name 
 twice, once for tmp and then again when moved into place).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-3417) CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme

2011-10-13 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13126847#comment-13126847
 ] 

jirapos...@reviews.apache.org commented on HBASE-3417:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2379/#review2565
---



/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
https://reviews.apache.org/r/2379/#comment5749

@Mikhail yeah.   I figured it out.  I was thinking we should allow 
dashes... then it looks like a uuid and makes compare  by human easier.


- Michael


On 2011-10-13 19:33:02, Jonathan Gray wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/2379/
bq.  ---
bq.  
bq.  (Updated 2011-10-13 19:33:02)
bq.  
bq.  
bq.  Review request for hbase, Dhruba Borthakur, Michael Stack, and Mikhail 
Bautin.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  Adds a new test that fails w/o this fix and changes the naming scheme for 
storefiles to use UUID instead of random longs as ascii.
bq.  
bq.  The big change is that the name of the tmp file used when flushing and 
compacting is the same name (but different dir) when moved in place.  This 
makes it so block names are consistent for COW.
bq.  
bq.  
bq.  This addresses bug HBASE-3417.
bq.  https://issues.apache.org/jira/browse/HBASE-3417
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq./src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java 
1183046 
bq./src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java 1183046 
bq./src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1183046 
bq./src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 
1183046 
bq./src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 
1183046 
bq.  
bq.  Diff: https://reviews.apache.org/r/2379/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  TestFromClientSide and TestCacheOnWrite both working
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Jonathan
bq.  
bq.



 CacheOnWrite is using the temporary output path for block names, need to use 
 a more consistent block naming scheme
 --

 Key: HBASE-3417
 URL: https://issues.apache.org/jira/browse/HBASE-3417
 Project: HBase
  Issue Type: Bug
  Components: io, regionserver
Affects Versions: 0.92.0
Reporter: Jonathan Gray
Assignee: Jonathan Gray
Priority: Critical
 Fix For: 0.92.0

 Attachments: HBASE-3417-redux-v1.patch, HBASE-3417-v1.patch, 
 HBASE-3417-v2.patch, HBASE-3417-v5.patch


 Currently the block names used in the block cache are built using the 
 filesystem path.  However, for cache on write, the path is a temporary output 
 file.
 The original COW patch actually made some modifications to block naming stuff 
 to make it more consistent but did not do enough.  Should add a separate 
 method somewhere for generating block names using some more easily mocked 
 scheme (rather than just raw path as we generate a random unique file name 
 twice, once for tmp and then again when moved into place).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-3417) CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme

2011-10-13 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13126848#comment-13126848
 ] 

jirapos...@reviews.apache.org commented on HBASE-3417:
--



bq.  On 2011-10-13 19:31:38, Michael Stack wrote:
bq.   /src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java, line 
100
bq.   https://reviews.apache.org/r/2379/diff/2/?file=49943#file49943line100
bq.  
bq.   I like this suggestion... keeping config immutable and go make a new 
one if you need to change something... w/ all final.

I like the idea too, but it's more clunky.  Then CacheConfig can't be final in 
things like Store and it becomes not very useful in other tests.  I'm going to 
make the test method names more explicit.


- Jonathan


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2379/#review2564
---


On 2011-10-13 19:33:02, Jonathan Gray wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/2379/
bq.  ---
bq.  
bq.  (Updated 2011-10-13 19:33:02)
bq.  
bq.  
bq.  Review request for hbase, Dhruba Borthakur, Michael Stack, and Mikhail 
Bautin.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  Adds a new test that fails w/o this fix and changes the naming scheme for 
storefiles to use UUID instead of random longs as ascii.
bq.  
bq.  The big change is that the name of the tmp file used when flushing and 
compacting is the same name (but different dir) when moved in place.  This 
makes it so block names are consistent for COW.
bq.  
bq.  
bq.  This addresses bug HBASE-3417.
bq.  https://issues.apache.org/jira/browse/HBASE-3417
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq./src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java 
1183046 
bq./src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java 1183046 
bq./src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1183046 
bq./src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 
1183046 
bq./src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 
1183046 
bq.  
bq.  Diff: https://reviews.apache.org/r/2379/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  TestFromClientSide and TestCacheOnWrite both working
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Jonathan
bq.  
bq.



 CacheOnWrite is using the temporary output path for block names, need to use 
 a more consistent block naming scheme
 --

 Key: HBASE-3417
 URL: https://issues.apache.org/jira/browse/HBASE-3417
 Project: HBase
  Issue Type: Bug
  Components: io, regionserver
Affects Versions: 0.92.0
Reporter: Jonathan Gray
Assignee: Jonathan Gray
Priority: Critical
 Fix For: 0.92.0

 Attachments: HBASE-3417-redux-v1.patch, HBASE-3417-v1.patch, 
 HBASE-3417-v2.patch, HBASE-3417-v5.patch


 Currently the block names used in the block cache are built using the 
 filesystem path.  However, for cache on write, the path is a temporary output 
 file.
 The original COW patch actually made some modifications to block naming stuff 
 to make it more consistent but did not do enough.  Should add a separate 
 method somewhere for generating block names using some more easily mocked 
 scheme (rather than just raw path as we generate a random unique file name 
 twice, once for tmp and then again when moved into place).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-3417) CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme

2011-10-13 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13126851#comment-13126851
 ] 

jirapos...@reviews.apache.org commented on HBASE-3417:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2379/
---

(Updated 2011-10-13 19:36:33.845621)


Review request for hbase, Dhruba Borthakur, Michael Stack, and Mikhail Bautin.


Changes
---

Renamed setters to be prefixed with:  forTestsOnly_

(I'm still opting to keep it this way because I know that the next set of tests 
I write for the compressed cache stuff is going to want to do the same thing)


Summary
---

Adds a new test that fails w/o this fix and changes the naming scheme for 
storefiles to use UUID instead of random longs as ascii.

The big change is that the name of the tmp file used when flushing and 
compacting is the same name (but different dir) when moved in place.  This 
makes it so block names are consistent for COW.


This addresses bug HBASE-3417.
https://issues.apache.org/jira/browse/HBASE-3417


Diffs (updated)
-

  /src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java 
1183049 
  /src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java 1183049 
  /src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1183049 
  /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1183049 
  /src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 1183049 

Diff: https://reviews.apache.org/r/2379/diff


Testing
---

TestFromClientSide and TestCacheOnWrite both working


Thanks,

Jonathan



 CacheOnWrite is using the temporary output path for block names, need to use 
 a more consistent block naming scheme
 --

 Key: HBASE-3417
 URL: https://issues.apache.org/jira/browse/HBASE-3417
 Project: HBase
  Issue Type: Bug
  Components: io, regionserver
Affects Versions: 0.92.0
Reporter: Jonathan Gray
Assignee: Jonathan Gray
Priority: Critical
 Fix For: 0.92.0

 Attachments: HBASE-3417-redux-v1.patch, HBASE-3417-v1.patch, 
 HBASE-3417-v2.patch, HBASE-3417-v5.patch


 Currently the block names used in the block cache are built using the 
 filesystem path.  However, for cache on write, the path is a temporary output 
 file.
 The original COW patch actually made some modifications to block naming stuff 
 to make it more consistent but did not do enough.  Should add a separate 
 method somewhere for generating block names using some more easily mocked 
 scheme (rather than just raw path as we generate a random unique file name 
 twice, once for tmp and then again when moved into place).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-3417) CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme

2011-10-13 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13126853#comment-13126853
 ] 

jirapos...@reviews.apache.org commented on HBASE-3417:
--



bq.  On 2011-10-13 19:33:27, Michael Stack wrote:
bq.   /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java, line 
166
bq.   https://reviews.apache.org/r/2379/diff/2/?file=49945#file49945line166
bq.  
bq.   @Mikhail yeah.   I figured it out.  I was thinking we should 
allow dashes... then it looks like a uuid and makes compare  by human easier.

we had this argument months ago.  there's a good consistency in the UIs w/ the 
current region encoded names (used as dir names) w/ these uuid (minus -) names. 
 The dashes makes mismatched (as I recall my thoughts on the matter way back 
when it originally had dashes).

i stand by the no dashes!  :)


- Jonathan


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2379/#review2565
---


On 2011-10-13 19:36:33, Jonathan Gray wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/2379/
bq.  ---
bq.  
bq.  (Updated 2011-10-13 19:36:33)
bq.  
bq.  
bq.  Review request for hbase, Dhruba Borthakur, Michael Stack, and Mikhail 
Bautin.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  Adds a new test that fails w/o this fix and changes the naming scheme for 
storefiles to use UUID instead of random longs as ascii.
bq.  
bq.  The big change is that the name of the tmp file used when flushing and 
compacting is the same name (but different dir) when moved in place.  This 
makes it so block names are consistent for COW.
bq.  
bq.  
bq.  This addresses bug HBASE-3417.
bq.  https://issues.apache.org/jira/browse/HBASE-3417
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq./src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java 
1183049 
bq./src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java 1183049 
bq./src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1183049 
bq./src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 
1183049 
bq./src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 
1183049 
bq.  
bq.  Diff: https://reviews.apache.org/r/2379/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  TestFromClientSide and TestCacheOnWrite both working
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Jonathan
bq.  
bq.



 CacheOnWrite is using the temporary output path for block names, need to use 
 a more consistent block naming scheme
 --

 Key: HBASE-3417
 URL: https://issues.apache.org/jira/browse/HBASE-3417
 Project: HBase
  Issue Type: Bug
  Components: io, regionserver
Affects Versions: 0.92.0
Reporter: Jonathan Gray
Assignee: Jonathan Gray
Priority: Critical
 Fix For: 0.92.0

 Attachments: HBASE-3417-redux-v1.patch, HBASE-3417-v1.patch, 
 HBASE-3417-v2.patch, HBASE-3417-v5.patch


 Currently the block names used in the block cache are built using the 
 filesystem path.  However, for cache on write, the path is a temporary output 
 file.
 The original COW patch actually made some modifications to block naming stuff 
 to make it more consistent but did not do enough.  Should add a separate 
 method somewhere for generating block names using some more easily mocked 
 scheme (rather than just raw path as we generate a random unique file name 
 twice, once for tmp and then again when moved into place).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-3417) CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme

2011-10-13 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13126856#comment-13126856
 ] 

jirapos...@reviews.apache.org commented on HBASE-3417:
--



bq.  On 2011-10-13 19:33:27, Michael Stack wrote:
bq.   /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java, line 
166
bq.   https://reviews.apache.org/r/2379/diff/2/?file=49945#file49945line166
bq.  
bq.   @Mikhail yeah.   I figured it out.  I was thinking we should 
allow dashes... then it looks like a uuid and makes compare  by human easier.
bq.  
bq.  Jonathan Gray wrote:
bq.  we had this argument months ago.  there's a good consistency in the 
UIs w/ the current region encoded names (used as dir names) w/ these uuid 
(minus -) names.  The dashes makes mismatched (as I recall my thoughts on the 
matter way back when it originally had dashes).
bq.  
bq.  i stand by the no dashes!  :)

I took a look.  The UUID ends up being the key in a LRU cache.  I'm fine w/ it 
(I argued for keeping them back then too... sorry for rehearsing old arg)


- Michael


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2379/#review2565
---


On 2011-10-13 19:36:33, Jonathan Gray wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/2379/
bq.  ---
bq.  
bq.  (Updated 2011-10-13 19:36:33)
bq.  
bq.  
bq.  Review request for hbase, Dhruba Borthakur, Michael Stack, and Mikhail 
Bautin.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  Adds a new test that fails w/o this fix and changes the naming scheme for 
storefiles to use UUID instead of random longs as ascii.
bq.  
bq.  The big change is that the name of the tmp file used when flushing and 
compacting is the same name (but different dir) when moved in place.  This 
makes it so block names are consistent for COW.
bq.  
bq.  
bq.  This addresses bug HBASE-3417.
bq.  https://issues.apache.org/jira/browse/HBASE-3417
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq./src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java 
1183049 
bq./src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java 1183049 
bq./src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1183049 
bq./src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 
1183049 
bq./src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 
1183049 
bq.  
bq.  Diff: https://reviews.apache.org/r/2379/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  TestFromClientSide and TestCacheOnWrite both working
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Jonathan
bq.  
bq.



 CacheOnWrite is using the temporary output path for block names, need to use 
 a more consistent block naming scheme
 --

 Key: HBASE-3417
 URL: https://issues.apache.org/jira/browse/HBASE-3417
 Project: HBase
  Issue Type: Bug
  Components: io, regionserver
Affects Versions: 0.92.0
Reporter: Jonathan Gray
Assignee: Jonathan Gray
Priority: Critical
 Fix For: 0.92.0

 Attachments: HBASE-3417-redux-v1.patch, HBASE-3417-v1.patch, 
 HBASE-3417-v2.patch, HBASE-3417-v5.patch


 Currently the block names used in the block cache are built using the 
 filesystem path.  However, for cache on write, the path is a temporary output 
 file.
 The original COW patch actually made some modifications to block naming stuff 
 to make it more consistent but did not do enough.  Should add a separate 
 method somewhere for generating block names using some more easily mocked 
 scheme (rather than just raw path as we generate a random unique file name 
 twice, once for tmp and then again when moved into place).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-3417) CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme

2011-10-13 Thread Jonathan Gray (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13127018#comment-13127018
 ] 

Jonathan Gray commented on HBASE-3417:
--

Stack, I was going to open a new JIRA, but it is the exact same issue and a 
nearly identical patch (primary difference is pre/post hfile v2).  It was just 
incorrect to close this following commit of hfile v2 which was unrelated to 
this bug.  Nothing was ever committed under this JIRA so just reopened with an 
updated patch.

I think things get confusing when there is more than one commit per branch per 
jira.  We should probably ban that practice.  Or at least institute some kind 
of standardized commit message (HBASE-3417, HBASE-3417-B, HBASE-3417-C, etc) or 
some such thing.

 CacheOnWrite is using the temporary output path for block names, need to use 
 a more consistent block naming scheme
 --

 Key: HBASE-3417
 URL: https://issues.apache.org/jira/browse/HBASE-3417
 Project: HBase
  Issue Type: Bug
  Components: io, regionserver
Affects Versions: 0.92.0
Reporter: Jonathan Gray
Assignee: Jonathan Gray
Priority: Critical
 Fix For: 0.92.0

 Attachments: HBASE-3417-redux-v1.patch, HBASE-3417-v1.patch, 
 HBASE-3417-v2.patch, HBASE-3417-v5.patch


 Currently the block names used in the block cache are built using the 
 filesystem path.  However, for cache on write, the path is a temporary output 
 file.
 The original COW patch actually made some modifications to block naming stuff 
 to make it more consistent but did not do enough.  Should add a separate 
 method somewhere for generating block names using some more easily mocked 
 scheme (rather than just raw path as we generate a random unique file name 
 twice, once for tmp and then again when moved into place).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-3417) CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme

2011-10-13 Thread Jonathan Gray (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13127051#comment-13127051
 ] 

Jonathan Gray commented on HBASE-3417:
--

I didn't mark as incompatible but it is only one-way compatible.

There is actually a very trivial change that can be made in the 0.90 branch (or 
any other branches) to make this change compatible in all directions.  Just 
need to update the REF_NAME_PARSER regex to be what it is in this change 
(tolerant of [a-f] in addition to digits).  That's it.

 CacheOnWrite is using the temporary output path for block names, need to use 
 a more consistent block naming scheme
 --

 Key: HBASE-3417
 URL: https://issues.apache.org/jira/browse/HBASE-3417
 Project: HBase
  Issue Type: Bug
  Components: io, regionserver
Affects Versions: 0.92.0
Reporter: Jonathan Gray
Assignee: Jonathan Gray
Priority: Critical
 Fix For: 0.92.0

 Attachments: HBASE-3417-redux-v1.patch, HBASE-3417-v1.patch, 
 HBASE-3417-v2.patch, HBASE-3417-v5.patch


 Currently the block names used in the block cache are built using the 
 filesystem path.  However, for cache on write, the path is a temporary output 
 file.
 The original COW patch actually made some modifications to block naming stuff 
 to make it more consistent but did not do enough.  Should add a separate 
 method somewhere for generating block names using some more easily mocked 
 scheme (rather than just raw path as we generate a random unique file name 
 twice, once for tmp and then again when moved into place).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-3417) CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme

2011-10-13 Thread Jonathan Gray (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13127054#comment-13127054
 ] 

Jonathan Gray commented on HBASE-3417:
--

In StoreFile.java:
{code}
   private static final Pattern REF_NAME_PARSER =
-Pattern.compile(^(\\d+)(?:\\.(.+))?$);
+Pattern.compile(^([0-9a-f]+)(?:\\.(.+))?$);
{code}

If you ever need to go backwards from 92 to a previous version.

 CacheOnWrite is using the temporary output path for block names, need to use 
 a more consistent block naming scheme
 --

 Key: HBASE-3417
 URL: https://issues.apache.org/jira/browse/HBASE-3417
 Project: HBase
  Issue Type: Bug
  Components: io, regionserver
Affects Versions: 0.92.0
Reporter: Jonathan Gray
Assignee: Jonathan Gray
Priority: Critical
 Fix For: 0.92.0

 Attachments: HBASE-3417-redux-v1.patch, HBASE-3417-v1.patch, 
 HBASE-3417-v2.patch, HBASE-3417-v5.patch


 Currently the block names used in the block cache are built using the 
 filesystem path.  However, for cache on write, the path is a temporary output 
 file.
 The original COW patch actually made some modifications to block naming stuff 
 to make it more consistent but did not do enough.  Should add a separate 
 method somewhere for generating block names using some more easily mocked 
 scheme (rather than just raw path as we generate a random unique file name 
 twice, once for tmp and then again when moved into place).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-3417) CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme

2011-10-13 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13127197#comment-13127197
 ] 

Hudson commented on HBASE-3417:
---

Integrated in HBase-0.92 #63 (See 
[https://builds.apache.org/job/HBase-0.92/63/])
HBASE-3417  CacheOnWrite is using the temporary output path for block 
names, need to use a more consistent block naming scheme (jgray)

jgray : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
* 
/hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java


 CacheOnWrite is using the temporary output path for block names, need to use 
 a more consistent block naming scheme
 --

 Key: HBASE-3417
 URL: https://issues.apache.org/jira/browse/HBASE-3417
 Project: HBase
  Issue Type: Bug
  Components: io, regionserver
Affects Versions: 0.92.0
Reporter: Jonathan Gray
Assignee: Jonathan Gray
Priority: Critical
 Fix For: 0.92.0

 Attachments: HBASE-3417-redux-v1.patch, HBASE-3417-v1.patch, 
 HBASE-3417-v2.patch, HBASE-3417-v5.patch


 Currently the block names used in the block cache are built using the 
 filesystem path.  However, for cache on write, the path is a temporary output 
 file.
 The original COW patch actually made some modifications to block naming stuff 
 to make it more consistent but did not do enough.  Should add a separate 
 method somewhere for generating block names using some more easily mocked 
 scheme (rather than just raw path as we generate a random unique file name 
 twice, once for tmp and then again when moved into place).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-3417) CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme

2011-07-21 Thread Jonathan Gray (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13069204#comment-13069204
 ] 

Jonathan Gray commented on HBASE-3417:
--

It does support COW but if it doesn't include changes to how files are named, 
it will still need this fix.  Will follow-up with Mikhail.

 CacheOnWrite is using the temporary output path for block names, need to use 
 a more consistent block naming scheme
 --

 Key: HBASE-3417
 URL: https://issues.apache.org/jira/browse/HBASE-3417
 Project: HBase
  Issue Type: Bug
  Components: io, regionserver
Affects Versions: 0.92.0
Reporter: Jonathan Gray
Assignee: Jonathan Gray
Priority: Critical
 Fix For: 0.92.0

 Attachments: HBASE-3417-v1.patch, HBASE-3417-v2.patch, 
 HBASE-3417-v5.patch


 Currently the block names used in the block cache are built using the 
 filesystem path.  However, for cache on write, the path is a temporary output 
 file.
 The original COW patch actually made some modifications to block naming stuff 
 to make it more consistent but did not do enough.  Should add a separate 
 method somewhere for generating block names using some more easily mocked 
 scheme (rather than just raw path as we generate a random unique file name 
 twice, once for tmp and then again when moved into place).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-3417) CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme

2011-07-21 Thread Mikhail Bautin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13069216#comment-13069216
 ] 

Mikhail Bautin commented on HBASE-3417:
---

@stack: This patch is already part of HFile-v2. Thanks!

 CacheOnWrite is using the temporary output path for block names, need to use 
 a more consistent block naming scheme
 --

 Key: HBASE-3417
 URL: https://issues.apache.org/jira/browse/HBASE-3417
 Project: HBase
  Issue Type: Bug
  Components: io, regionserver
Affects Versions: 0.92.0
Reporter: Jonathan Gray
Assignee: Jonathan Gray
Priority: Critical
 Fix For: 0.92.0

 Attachments: HBASE-3417-v1.patch, HBASE-3417-v2.patch, 
 HBASE-3417-v5.patch


 Currently the block names used in the block cache are built using the 
 filesystem path.  However, for cache on write, the path is a temporary output 
 file.
 The original COW patch actually made some modifications to block naming stuff 
 to make it more consistent but did not do enough.  Should add a separate 
 method somewhere for generating block names using some more easily mocked 
 scheme (rather than just raw path as we generate a random unique file name 
 twice, once for tmp and then again when moved into place).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-3417) CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme

2011-03-21 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13009396#comment-13009396
 ] 

stack commented on HBASE-3417:
--

You running this patch still Jon or did you do something else?

 CacheOnWrite is using the temporary output path for block names, need to use 
 a more consistent block naming scheme
 --

 Key: HBASE-3417
 URL: https://issues.apache.org/jira/browse/HBASE-3417
 Project: HBase
  Issue Type: Bug
  Components: io, regionserver
Affects Versions: 0.92.0
Reporter: Jonathan Gray
Assignee: Jonathan Gray
Priority: Critical
 Fix For: 0.92.0

 Attachments: HBASE-3417-v1.patch, HBASE-3417-v2.patch, 
 HBASE-3417-v5.patch


 Currently the block names used in the block cache are built using the 
 filesystem path.  However, for cache on write, the path is a temporary output 
 file.
 The original COW patch actually made some modifications to block naming stuff 
 to make it more consistent but did not do enough.  Should add a separate 
 method somewhere for generating block names using some more easily mocked 
 scheme (rather than just raw path as we generate a random unique file name 
 twice, once for tmp and then again when moved into place).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-3417) CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme

2011-03-21 Thread Jonathan Gray (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13009449#comment-13009449
 ] 

Jonathan Gray commented on HBASE-3417:
--

Just verified that this is the same as what we have been running with in 
production (since the patch was put up in January).

I'm ready to commit if you want to +1 me :)

 CacheOnWrite is using the temporary output path for block names, need to use 
 a more consistent block naming scheme
 --

 Key: HBASE-3417
 URL: https://issues.apache.org/jira/browse/HBASE-3417
 Project: HBase
  Issue Type: Bug
  Components: io, regionserver
Affects Versions: 0.92.0
Reporter: Jonathan Gray
Assignee: Jonathan Gray
Priority: Critical
 Fix For: 0.92.0

 Attachments: HBASE-3417-v1.patch, HBASE-3417-v2.patch, 
 HBASE-3417-v5.patch


 Currently the block names used in the block cache are built using the 
 filesystem path.  However, for cache on write, the path is a temporary output 
 file.
 The original COW patch actually made some modifications to block naming stuff 
 to make it more consistent but did not do enough.  Should add a separate 
 method somewhere for generating block names using some more easily mocked 
 scheme (rather than just raw path as we generate a random unique file name 
 twice, once for tmp and then again when moved into place).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-3417) CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme

2011-03-21 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13009537#comment-13009537
 ] 

stack commented on HBASE-3417:
--

+1 on patch for 0.92 (But don't include the TestResettingCounters pollution).

 CacheOnWrite is using the temporary output path for block names, need to use 
 a more consistent block naming scheme
 --

 Key: HBASE-3417
 URL: https://issues.apache.org/jira/browse/HBASE-3417
 Project: HBase
  Issue Type: Bug
  Components: io, regionserver
Affects Versions: 0.92.0
Reporter: Jonathan Gray
Assignee: Jonathan Gray
Priority: Critical
 Fix For: 0.92.0

 Attachments: HBASE-3417-v1.patch, HBASE-3417-v2.patch, 
 HBASE-3417-v5.patch


 Currently the block names used in the block cache are built using the 
 filesystem path.  However, for cache on write, the path is a temporary output 
 file.
 The original COW patch actually made some modifications to block naming stuff 
 to make it more consistent but did not do enough.  Should add a separate 
 method somewhere for generating block names using some more easily mocked 
 scheme (rather than just raw path as we generate a random unique file name 
 twice, once for tmp and then again when moved into place).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] Commented: (HBASE-3417) CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme

2011-01-06 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12978427#action_12978427
 ] 

stack commented on HBASE-3417:
--

Should you add in A-Z in below just in case?

{code}
+Pattern.compile(^([0-9a-f]+)(?:\\.(.+))?$);
{code}

Yeah, don't replace the '-' I'd say:

{code}
+return new Path(dir, UUID.randomUUID().toString().replaceAll(-, )
++ ((suffix == null || suffix.length() = 0) ?  : suffix));
{code}

Then its easy to go back to UUID.  You might want to do that so you can use the 
128 bits as key in LRU rather than String?

Otherwise, +1 IFF verified backward compatible.




 CacheOnWrite is using the temporary output path for block names, need to use 
 a more consistent block naming scheme
 --

 Key: HBASE-3417
 URL: https://issues.apache.org/jira/browse/HBASE-3417
 Project: HBase
  Issue Type: Bug
  Components: io, regionserver
Affects Versions: 0.92.0
Reporter: Jonathan Gray
Assignee: Jonathan Gray
Priority: Critical
 Fix For: 0.92.0

 Attachments: HBASE-3417-v1.patch, HBASE-3417-v2.patch, 
 HBASE-3417-v5.patch


 Currently the block names used in the block cache are built using the 
 filesystem path.  However, for cache on write, the path is a temporary output 
 file.
 The original COW patch actually made some modifications to block naming stuff 
 to make it more consistent but did not do enough.  Should add a separate 
 method somewhere for generating block names using some more easily mocked 
 scheme (rather than just raw path as we generate a random unique file name 
 twice, once for tmp and then again when moved into place).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HBASE-3417) CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme

2011-01-06 Thread Jonathan Gray (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12978431#action_12978431
 ] 

Jonathan Gray commented on HBASE-3417:
--

bq. Should you add in A-Z in below just in case?
Could add A-F (uuid is hex chars only), but it's unnecessary.

bq. Then its easy to go back to UUID. You might want to do that so you can use 
the 128 bits as key in LRU rather than String?
LRU uses a String for block name.  I think it looks much nicer with a 
consistent looking naming scheme for region directories and storefiles.  And I 
don't think we need to be overly concerned about the size... If 64K block, in 
the LRU we're talking about 0.05% overhead (or like 0.02% over a more compact 
version).

Also, traditional GUID format reminds me of Microsoft SQL Server :)

This latest v5 patch is being deployed on a 100 node cluster with existing data 
tonight.  Will commit once verified that it's working there.

 CacheOnWrite is using the temporary output path for block names, need to use 
 a more consistent block naming scheme
 --

 Key: HBASE-3417
 URL: https://issues.apache.org/jira/browse/HBASE-3417
 Project: HBase
  Issue Type: Bug
  Components: io, regionserver
Affects Versions: 0.92.0
Reporter: Jonathan Gray
Assignee: Jonathan Gray
Priority: Critical
 Fix For: 0.92.0

 Attachments: HBASE-3417-v1.patch, HBASE-3417-v2.patch, 
 HBASE-3417-v5.patch


 Currently the block names used in the block cache are built using the 
 filesystem path.  However, for cache on write, the path is a temporary output 
 file.
 The original COW patch actually made some modifications to block naming stuff 
 to make it more consistent but did not do enough.  Should add a separate 
 method somewhere for generating block names using some more easily mocked 
 scheme (rather than just raw path as we generate a random unique file name 
 twice, once for tmp and then again when moved into place).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HBASE-3417) CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme

2011-01-05 Thread Jonathan Gray (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12977943#action_12977943
 ] 

Jonathan Gray commented on HBASE-3417:
--

One idea from discussion with stack is to use a UUID for the filename.  That 
way we can generate it once for the temporary file and then just move it in 
place without doing a rename.  Would then just use UUID + blockNumber as the 
blockName.

 CacheOnWrite is using the temporary output path for block names, need to use 
 a more consistent block naming scheme
 --

 Key: HBASE-3417
 URL: https://issues.apache.org/jira/browse/HBASE-3417
 Project: HBase
  Issue Type: Bug
  Components: io, regionserver
Affects Versions: 0.92.0
Reporter: Jonathan Gray
Assignee: Jonathan Gray
Priority: Critical
 Fix For: 0.92.0


 Currently the block names used in the block cache are built using the 
 filesystem path.  However, for cache on write, the path is a temporary output 
 file.
 The original COW patch actually made some modifications to block naming stuff 
 to make it more consistent but did not do enough.  Should add a separate 
 method somewhere for generating block names using some more easily mocked 
 scheme (rather than just raw path as we generate a random unique file name 
 twice, once for tmp and then again when moved into place).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HBASE-3417) CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme

2011-01-05 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12978122#action_12978122
 ] 

stack commented on HBASE-3417:
--

As discussed up on IRC, this is not backward compatible:

{code}
+Pattern.compile(^(\\w{32})(?:\\.(.+))?$);
{code}

You can do a range IIRC 20-32 (was old length 20 chars?)

The below is a little bit messy.:

{code}
+return new Path(dir, UUID.randomUUID().toString().replaceAll(-, )
++ ((suffix == null || suffix.length() = 0) ?  : suffix));
{code}

Up on IRC, was thinking should base64 because then it'd be more compact.   See 
http://stackoverflow.com/questions/772802/storing-uuid-as-base64-string.  There 
is also in hbase util a Base64#encodeBytes method that will take the 128 UUID 
bits and emit them as base64 (Possible to get it all down to 22 chars).  But 
looking at the base64 vocabulary, http://en.wikipedia.org/wiki/Base64, it 
includes '+' and '/' which are illegal in URL, a hdfs filepath.  Base32?  
http://en.wikipedia.org/wiki/Base32? But that won't work either.  Has to be 
multiples of 40 bits.

Maybe leave it as it comes out of UUID.toString w/ hyphens.  Then its plain its 
a UUID and its easier to read?



 CacheOnWrite is using the temporary output path for block names, need to use 
 a more consistent block naming scheme
 --

 Key: HBASE-3417
 URL: https://issues.apache.org/jira/browse/HBASE-3417
 Project: HBase
  Issue Type: Bug
  Components: io, regionserver
Affects Versions: 0.92.0
Reporter: Jonathan Gray
Assignee: Jonathan Gray
Priority: Critical
 Fix For: 0.92.0

 Attachments: HBASE-3417-v1.patch, HBASE-3417-v2.patch


 Currently the block names used in the block cache are built using the 
 filesystem path.  However, for cache on write, the path is a temporary output 
 file.
 The original COW patch actually made some modifications to block naming stuff 
 to make it more consistent but did not do enough.  Should add a separate 
 method somewhere for generating block names using some more easily mocked 
 scheme (rather than just raw path as we generate a random unique file name 
 twice, once for tmp and then again when moved into place).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HBASE-3417) CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme

2011-01-05 Thread Jonathan Gray (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12978124#action_12978124
 ] 

Jonathan Gray commented on HBASE-3417:
--

I changed regex to be {{([0-9a-z]+)}}

I kind of like how it is.  It looks just like the encoded region names used for 
region directory names.

 CacheOnWrite is using the temporary output path for block names, need to use 
 a more consistent block naming scheme
 --

 Key: HBASE-3417
 URL: https://issues.apache.org/jira/browse/HBASE-3417
 Project: HBase
  Issue Type: Bug
  Components: io, regionserver
Affects Versions: 0.92.0
Reporter: Jonathan Gray
Assignee: Jonathan Gray
Priority: Critical
 Fix For: 0.92.0

 Attachments: HBASE-3417-v1.patch, HBASE-3417-v2.patch


 Currently the block names used in the block cache are built using the 
 filesystem path.  However, for cache on write, the path is a temporary output 
 file.
 The original COW patch actually made some modifications to block naming stuff 
 to make it more consistent but did not do enough.  Should add a separate 
 method somewhere for generating block names using some more easily mocked 
 scheme (rather than just raw path as we generate a random unique file name 
 twice, once for tmp and then again when moved into place).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HBASE-3417) CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme

2011-01-05 Thread Jonathan Gray (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12978125#action_12978125
 ] 

Jonathan Gray commented on HBASE-3417:
--

Old random file name was using rand.nextLong() so it could be any length = 1.

 CacheOnWrite is using the temporary output path for block names, need to use 
 a more consistent block naming scheme
 --

 Key: HBASE-3417
 URL: https://issues.apache.org/jira/browse/HBASE-3417
 Project: HBase
  Issue Type: Bug
  Components: io, regionserver
Affects Versions: 0.92.0
Reporter: Jonathan Gray
Assignee: Jonathan Gray
Priority: Critical
 Fix For: 0.92.0

 Attachments: HBASE-3417-v1.patch, HBASE-3417-v2.patch


 Currently the block names used in the block cache are built using the 
 filesystem path.  However, for cache on write, the path is a temporary output 
 file.
 The original COW patch actually made some modifications to block naming stuff 
 to make it more consistent but did not do enough.  Should add a separate 
 method somewhere for generating block names using some more easily mocked 
 scheme (rather than just raw path as we generate a random unique file name 
 twice, once for tmp and then again when moved into place).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.