[jira] [Commented] (HBASE-3417) CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme
[ https://issues.apache.org/jira/browse/HBASE-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13128386#comment-13128386 ] Hudson commented on HBASE-3417: --- Integrated in HBase-TRUNK #2325 (See [https://builds.apache.org/job/HBase-TRUNK/2325/]) HBASE-3417 CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme (jgray) jgray : Files : * /hbase/trunk/CHANGES.txt * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java * /hbase/trunk/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme -- Key: HBASE-3417 URL: https://issues.apache.org/jira/browse/HBASE-3417 Project: HBase Issue Type: Bug Components: io, regionserver Affects Versions: 0.92.0 Reporter: Jonathan Gray Assignee: Jonathan Gray Priority: Critical Fix For: 0.92.0 Attachments: HBASE-3417-redux-v1.patch, HBASE-3417-v1.patch, HBASE-3417-v2.patch, HBASE-3417-v5.patch Currently the block names used in the block cache are built using the filesystem path. However, for cache on write, the path is a temporary output file. The original COW patch actually made some modifications to block naming stuff to make it more consistent but did not do enough. Should add a separate method somewhere for generating block names using some more easily mocked scheme (rather than just raw path as we generate a random unique file name twice, once for tmp and then again when moved into place). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3417) CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme
[ https://issues.apache.org/jira/browse/HBASE-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13126378#comment-13126378 ] jirapos...@reviews.apache.org commented on HBASE-3417: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2379/ --- (Updated 2011-10-13 06:31:13.710051) Review request for hbase, Dhruba Borthakur, Michael Stack, and Mikhail Bautin. Changes --- Applies cleanly. Summary --- Adds a new test that fails w/o this fix and changes the naming scheme for storefiles to use UUID instead of random longs as ascii. The big change is that the name of the tmp file used when flushing and compacting is the same name (but different dir) when moved in place. This makes it so block names are consistent for COW. This addresses bug HBASE-3417. https://issues.apache.org/jira/browse/HBASE-3417 Diffs (updated) - /src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java 1182679 /src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java 1182679 /src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1182679 /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1182679 /src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 1182679 Diff: https://reviews.apache.org/r/2379/diff Testing --- TestFromClientSide and TestCacheOnWrite both working Thanks, Jonathan CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme -- Key: HBASE-3417 URL: https://issues.apache.org/jira/browse/HBASE-3417 Project: HBase Issue Type: Bug Components: io, regionserver Affects Versions: 0.92.0 Reporter: Jonathan Gray Assignee: Jonathan Gray Priority: Critical Fix For: 0.92.0 Attachments: HBASE-3417-redux-v1.patch, HBASE-3417-v1.patch, HBASE-3417-v2.patch, HBASE-3417-v5.patch Currently the block names used in the block cache are built using the filesystem path. However, for cache on write, the path is a temporary output file. The original COW patch actually made some modifications to block naming stuff to make it more consistent but did not do enough. Should add a separate method somewhere for generating block names using some more easily mocked scheme (rather than just raw path as we generate a random unique file name twice, once for tmp and then again when moved into place). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3417) CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme
[ https://issues.apache.org/jira/browse/HBASE-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13126639#comment-13126639 ] jirapos...@reviews.apache.org commented on HBASE-3417: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2379/#review2548 --- Ship it! Just a few questions in the below. I'm good w/ commit. Would suggest open new issue rather than reopen an old one in future -- makes it easier on the fellas coming up behind us trying to make some sense of what we did. Good stuff. /src/main/java/org/apache/hadoop/hbase/regionserver/Store.java https://reviews.apache.org/r/2379/#comment5719 This is the tmp name? Thats what you want? I don't see where it gets set to renamed filename. /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java https://reviews.apache.org/r/2379/#comment5720 Why? To be more specific? Oh, I see how about not removing the '-' from uuid? Is it illegal in filesystem? /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java https://reviews.apache.org/r/2379/#comment5721 This is better. - Michael On 2011-10-13 06:31:13, Jonathan Gray wrote: bq. bq. --- bq. This is an automatically generated e-mail. To reply, visit: bq. https://reviews.apache.org/r/2379/ bq. --- bq. bq. (Updated 2011-10-13 06:31:13) bq. bq. bq. Review request for hbase, Dhruba Borthakur, Michael Stack, and Mikhail Bautin. bq. bq. bq. Summary bq. --- bq. bq. Adds a new test that fails w/o this fix and changes the naming scheme for storefiles to use UUID instead of random longs as ascii. bq. bq. The big change is that the name of the tmp file used when flushing and compacting is the same name (but different dir) when moved in place. This makes it so block names are consistent for COW. bq. bq. bq. This addresses bug HBASE-3417. bq. https://issues.apache.org/jira/browse/HBASE-3417 bq. bq. bq. Diffs bq. - bq. bq./src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java 1182679 bq./src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java 1182679 bq./src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1182679 bq./src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1182679 bq./src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 1182679 bq. bq. Diff: https://reviews.apache.org/r/2379/diff bq. bq. bq. Testing bq. --- bq. bq. TestFromClientSide and TestCacheOnWrite both working bq. bq. bq. Thanks, bq. bq. Jonathan bq. bq. CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme -- Key: HBASE-3417 URL: https://issues.apache.org/jira/browse/HBASE-3417 Project: HBase Issue Type: Bug Components: io, regionserver Affects Versions: 0.92.0 Reporter: Jonathan Gray Assignee: Jonathan Gray Priority: Critical Fix For: 0.92.0 Attachments: HBASE-3417-redux-v1.patch, HBASE-3417-v1.patch, HBASE-3417-v2.patch, HBASE-3417-v5.patch Currently the block names used in the block cache are built using the filesystem path. However, for cache on write, the path is a temporary output file. The original COW patch actually made some modifications to block naming stuff to make it more consistent but did not do enough. Should add a separate method somewhere for generating block names using some more easily mocked scheme (rather than just raw path as we generate a random unique file name twice, once for tmp and then again when moved into place). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3417) CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme
[ https://issues.apache.org/jira/browse/HBASE-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13126764#comment-13126764 ] jirapos...@reviews.apache.org commented on HBASE-3417: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2379/#review2555 --- /src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java https://reviews.apache.org/r/2379/#comment5731 Can we keep these final and CacheConfig immutable, and create a separate instance of CacheConfig during testing instead? /src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java https://reviews.apache.org/r/2379/#comment5732 The same as above. /src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java https://reviews.apache.org/r/2379/#comment5736 Using the approach above we can avoid creating new setters. If it's not feasible, then appending ForTesting to these method names will probably be enough to discourage use of them in production. /src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java https://reviews.apache.org/r/2379/#comment5733 Maybe we can avoid these setters using the approach I described above? If not, it should either be package-private, or the name should include something like ForTesting. /src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java https://reviews.apache.org/r/2379/#comment5734 The same as above. /src/main/java/org/apache/hadoop/hbase/regionserver/Store.java https://reviews.apache.org/r/2379/#comment5737 This does not return null anymore in the new version? This will probably result in an IOException down the line, is that handled well? We could throw an IOException right here as an alternative. /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java https://reviews.apache.org/r/2379/#comment5738 @Michael: This pattern does not allow dashes in the UUID, it only allows 0-9 and a-f. /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java https://reviews.apache.org/r/2379/#comment5739 suffix.length() = 0 is unnecessary. /src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java https://reviews.apache.org/r/2379/#comment5740 Nice test! This goes all the way from client side to HFile, unlike TestCacheOnWrite. - Mikhail On 2011-10-13 06:31:13, Jonathan Gray wrote: bq. bq. --- bq. This is an automatically generated e-mail. To reply, visit: bq. https://reviews.apache.org/r/2379/ bq. --- bq. bq. (Updated 2011-10-13 06:31:13) bq. bq. bq. Review request for hbase, Dhruba Borthakur, Michael Stack, and Mikhail Bautin. bq. bq. bq. Summary bq. --- bq. bq. Adds a new test that fails w/o this fix and changes the naming scheme for storefiles to use UUID instead of random longs as ascii. bq. bq. The big change is that the name of the tmp file used when flushing and compacting is the same name (but different dir) when moved in place. This makes it so block names are consistent for COW. bq. bq. bq. This addresses bug HBASE-3417. bq. https://issues.apache.org/jira/browse/HBASE-3417 bq. bq. bq. Diffs bq. - bq. bq./src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java 1182679 bq./src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java 1182679 bq./src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1182679 bq./src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1182679 bq./src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 1182679 bq. bq. Diff: https://reviews.apache.org/r/2379/diff bq. bq. bq. Testing bq. --- bq. bq. TestFromClientSide and TestCacheOnWrite both working bq. bq. bq. Thanks, bq. bq. Jonathan bq. bq. CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme -- Key: HBASE-3417 URL: https://issues.apache.org/jira/browse/HBASE-3417 Project: HBase Issue Type: Bug Components: io, regionserver Affects Versions: 0.92.0 Reporter: Jonathan Gray Assignee: Jonathan Gray Priority: Critical Fix For: 0.92.0 Attachments: HBASE-3417-redux-v1.patch, HBASE-3417-v1.patch, HBASE-3417-v2.patch, HBASE-3417-v5.patch Currently the block names used in the block cache are built using the filesystem path. However, for cache on write,
[jira] [Commented] (HBASE-3417) CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme
[ https://issues.apache.org/jira/browse/HBASE-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13126766#comment-13126766 ] jirapos...@reviews.apache.org commented on HBASE-3417: -- bq. On 2011-10-13 17:58:45, Mikhail Bautin wrote: bq. Nice fix, Jonathan! For some reason I assumed that HBASE-3417 was part of the HFile v2 patch when I submitted the patch, but apparently it was not. - Mikhail --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2379/#review2555 --- On 2011-10-13 06:31:13, Jonathan Gray wrote: bq. bq. --- bq. This is an automatically generated e-mail. To reply, visit: bq. https://reviews.apache.org/r/2379/ bq. --- bq. bq. (Updated 2011-10-13 06:31:13) bq. bq. bq. Review request for hbase, Dhruba Borthakur, Michael Stack, and Mikhail Bautin. bq. bq. bq. Summary bq. --- bq. bq. Adds a new test that fails w/o this fix and changes the naming scheme for storefiles to use UUID instead of random longs as ascii. bq. bq. The big change is that the name of the tmp file used when flushing and compacting is the same name (but different dir) when moved in place. This makes it so block names are consistent for COW. bq. bq. bq. This addresses bug HBASE-3417. bq. https://issues.apache.org/jira/browse/HBASE-3417 bq. bq. bq. Diffs bq. - bq. bq./src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java 1182679 bq./src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java 1182679 bq./src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1182679 bq./src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1182679 bq./src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 1182679 bq. bq. Diff: https://reviews.apache.org/r/2379/diff bq. bq. bq. Testing bq. --- bq. bq. TestFromClientSide and TestCacheOnWrite both working bq. bq. bq. Thanks, bq. bq. Jonathan bq. bq. CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme -- Key: HBASE-3417 URL: https://issues.apache.org/jira/browse/HBASE-3417 Project: HBase Issue Type: Bug Components: io, regionserver Affects Versions: 0.92.0 Reporter: Jonathan Gray Assignee: Jonathan Gray Priority: Critical Fix For: 0.92.0 Attachments: HBASE-3417-redux-v1.patch, HBASE-3417-v1.patch, HBASE-3417-v2.patch, HBASE-3417-v5.patch Currently the block names used in the block cache are built using the filesystem path. However, for cache on write, the path is a temporary output file. The original COW patch actually made some modifications to block naming stuff to make it more consistent but did not do enough. Should add a separate method somewhere for generating block names using some more easily mocked scheme (rather than just raw path as we generate a random unique file name twice, once for tmp and then again when moved into place). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3417) CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme
[ https://issues.apache.org/jira/browse/HBASE-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13126832#comment-13126832 ] jirapos...@reviews.apache.org commented on HBASE-3417: -- bq. On 2011-10-13 14:49:35, Michael Stack wrote: bq. /src/main/java/org/apache/hadoop/hbase/regionserver/Store.java, line 496 bq. https://reviews.apache.org/r/2379/diff/2/?file=49944#file49944line496 bq. bq. This is the tmp name? Thats what you want? I don't see where it gets set to renamed filename. The tmp name and perm name are the same. The tmp is just in .tmp bq. On 2011-10-13 14:49:35, Michael Stack wrote: bq. /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java, line 166 bq. https://reviews.apache.org/r/2379/diff/2/?file=49945#file49945line166 bq. bq. Why? To be more specific? bq. bq. Oh, I see how about not removing the '-' from uuid? Is it illegal in filesystem? I am removing the '-'. This checks for alphanumerics only. - Jonathan --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2379/#review2548 --- On 2011-10-13 06:31:13, Jonathan Gray wrote: bq. bq. --- bq. This is an automatically generated e-mail. To reply, visit: bq. https://reviews.apache.org/r/2379/ bq. --- bq. bq. (Updated 2011-10-13 06:31:13) bq. bq. bq. Review request for hbase, Dhruba Borthakur, Michael Stack, and Mikhail Bautin. bq. bq. bq. Summary bq. --- bq. bq. Adds a new test that fails w/o this fix and changes the naming scheme for storefiles to use UUID instead of random longs as ascii. bq. bq. The big change is that the name of the tmp file used when flushing and compacting is the same name (but different dir) when moved in place. This makes it so block names are consistent for COW. bq. bq. bq. This addresses bug HBASE-3417. bq. https://issues.apache.org/jira/browse/HBASE-3417 bq. bq. bq. Diffs bq. - bq. bq./src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java 1182679 bq./src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java 1182679 bq./src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1182679 bq./src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1182679 bq./src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 1182679 bq. bq. Diff: https://reviews.apache.org/r/2379/diff bq. bq. bq. Testing bq. --- bq. bq. TestFromClientSide and TestCacheOnWrite both working bq. bq. bq. Thanks, bq. bq. Jonathan bq. bq. CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme -- Key: HBASE-3417 URL: https://issues.apache.org/jira/browse/HBASE-3417 Project: HBase Issue Type: Bug Components: io, regionserver Affects Versions: 0.92.0 Reporter: Jonathan Gray Assignee: Jonathan Gray Priority: Critical Fix For: 0.92.0 Attachments: HBASE-3417-redux-v1.patch, HBASE-3417-v1.patch, HBASE-3417-v2.patch, HBASE-3417-v5.patch Currently the block names used in the block cache are built using the filesystem path. However, for cache on write, the path is a temporary output file. The original COW patch actually made some modifications to block naming stuff to make it more consistent but did not do enough. Should add a separate method somewhere for generating block names using some more easily mocked scheme (rather than just raw path as we generate a random unique file name twice, once for tmp and then again when moved into place). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3417) CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme
[ https://issues.apache.org/jira/browse/HBASE-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13126834#comment-13126834 ] jirapos...@reviews.apache.org commented on HBASE-3417: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2379/#review2561 --- Thanks for the reviews stack and mikhail! /src/main/java/org/apache/hadoop/hbase/regionserver/Store.java https://reviews.apache.org/r/2379/#comment5744 I think i'll restore the old behavior of returning null, but if it gets an IOE, that will be bubbled up (that should happen for a harder failure than just a simple rename failure, in which case, we probably want to bubble it up?) - Jonathan On 2011-10-13 06:31:13, Jonathan Gray wrote: bq. bq. --- bq. This is an automatically generated e-mail. To reply, visit: bq. https://reviews.apache.org/r/2379/ bq. --- bq. bq. (Updated 2011-10-13 06:31:13) bq. bq. bq. Review request for hbase, Dhruba Borthakur, Michael Stack, and Mikhail Bautin. bq. bq. bq. Summary bq. --- bq. bq. Adds a new test that fails w/o this fix and changes the naming scheme for storefiles to use UUID instead of random longs as ascii. bq. bq. The big change is that the name of the tmp file used when flushing and compacting is the same name (but different dir) when moved in place. This makes it so block names are consistent for COW. bq. bq. bq. This addresses bug HBASE-3417. bq. https://issues.apache.org/jira/browse/HBASE-3417 bq. bq. bq. Diffs bq. - bq. bq./src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java 1182679 bq./src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java 1182679 bq./src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1182679 bq./src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1182679 bq./src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 1182679 bq. bq. Diff: https://reviews.apache.org/r/2379/diff bq. bq. bq. Testing bq. --- bq. bq. TestFromClientSide and TestCacheOnWrite both working bq. bq. bq. Thanks, bq. bq. Jonathan bq. bq. CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme -- Key: HBASE-3417 URL: https://issues.apache.org/jira/browse/HBASE-3417 Project: HBase Issue Type: Bug Components: io, regionserver Affects Versions: 0.92.0 Reporter: Jonathan Gray Assignee: Jonathan Gray Priority: Critical Fix For: 0.92.0 Attachments: HBASE-3417-redux-v1.patch, HBASE-3417-v1.patch, HBASE-3417-v2.patch, HBASE-3417-v5.patch Currently the block names used in the block cache are built using the filesystem path. However, for cache on write, the path is a temporary output file. The original COW patch actually made some modifications to block naming stuff to make it more consistent but did not do enough. Should add a separate method somewhere for generating block names using some more easily mocked scheme (rather than just raw path as we generate a random unique file name twice, once for tmp and then again when moved into place). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3417) CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme
[ https://issues.apache.org/jira/browse/HBASE-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13126838#comment-13126838 ] jirapos...@reviews.apache.org commented on HBASE-3417: -- bq. On 2011-10-13 19:24:46, Jonathan Gray wrote: bq. /src/main/java/org/apache/hadoop/hbase/regionserver/Store.java, lines 1228-1229 bq. https://reviews.apache.org/r/2379/diff/2/?file=49944#file49944line1228 bq. bq. I think i'll restore the old behavior of returning null, but if it gets an IOE, that will be bubbled up (that should happen for a harder failure than just a simple rename failure, in which case, we probably want to bubble it up?) Sounds good. By the way, is a null return value handled correctly upstream? The whole compaction probably needs to be considered failed in that case. - Mikhail --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2379/#review2561 --- On 2011-10-13 06:31:13, Jonathan Gray wrote: bq. bq. --- bq. This is an automatically generated e-mail. To reply, visit: bq. https://reviews.apache.org/r/2379/ bq. --- bq. bq. (Updated 2011-10-13 06:31:13) bq. bq. bq. Review request for hbase, Dhruba Borthakur, Michael Stack, and Mikhail Bautin. bq. bq. bq. Summary bq. --- bq. bq. Adds a new test that fails w/o this fix and changes the naming scheme for storefiles to use UUID instead of random longs as ascii. bq. bq. The big change is that the name of the tmp file used when flushing and compacting is the same name (but different dir) when moved in place. This makes it so block names are consistent for COW. bq. bq. bq. This addresses bug HBASE-3417. bq. https://issues.apache.org/jira/browse/HBASE-3417 bq. bq. bq. Diffs bq. - bq. bq./src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java 1182679 bq./src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java 1182679 bq./src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1182679 bq./src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1182679 bq./src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 1182679 bq. bq. Diff: https://reviews.apache.org/r/2379/diff bq. bq. bq. Testing bq. --- bq. bq. TestFromClientSide and TestCacheOnWrite both working bq. bq. bq. Thanks, bq. bq. Jonathan bq. bq. CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme -- Key: HBASE-3417 URL: https://issues.apache.org/jira/browse/HBASE-3417 Project: HBase Issue Type: Bug Components: io, regionserver Affects Versions: 0.92.0 Reporter: Jonathan Gray Assignee: Jonathan Gray Priority: Critical Fix For: 0.92.0 Attachments: HBASE-3417-redux-v1.patch, HBASE-3417-v1.patch, HBASE-3417-v2.patch, HBASE-3417-v5.patch Currently the block names used in the block cache are built using the filesystem path. However, for cache on write, the path is a temporary output file. The original COW patch actually made some modifications to block naming stuff to make it more consistent but did not do enough. Should add a separate method somewhere for generating block names using some more easily mocked scheme (rather than just raw path as we generate a random unique file name twice, once for tmp and then again when moved into place). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3417) CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme
[ https://issues.apache.org/jira/browse/HBASE-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13126839#comment-13126839 ] jirapos...@reviews.apache.org commented on HBASE-3417: -- bq. On 2011-10-13 19:24:46, Jonathan Gray wrote: bq. /src/main/java/org/apache/hadoop/hbase/regionserver/Store.java, lines 1228-1229 bq. https://reviews.apache.org/r/2379/diff/2/?file=49944#file49944line1228 bq. bq. I think i'll restore the old behavior of returning null, but if it gets an IOE, that will be bubbled up (that should happen for a harder failure than just a simple rename failure, in which case, we probably want to bubble it up?) bq. bq. Mikhail Bautin wrote: bq. Sounds good. By the way, is a null return value handled correctly upstream? The whole compaction probably needs to be considered failed in that case. Guess I should have looked at that. Seems that it is not handled. I'm going to change the behavior to throw an IOException rather than return null. I will kick off the test suite before committing. - Jonathan --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2379/#review2561 --- On 2011-10-13 06:31:13, Jonathan Gray wrote: bq. bq. --- bq. This is an automatically generated e-mail. To reply, visit: bq. https://reviews.apache.org/r/2379/ bq. --- bq. bq. (Updated 2011-10-13 06:31:13) bq. bq. bq. Review request for hbase, Dhruba Borthakur, Michael Stack, and Mikhail Bautin. bq. bq. bq. Summary bq. --- bq. bq. Adds a new test that fails w/o this fix and changes the naming scheme for storefiles to use UUID instead of random longs as ascii. bq. bq. The big change is that the name of the tmp file used when flushing and compacting is the same name (but different dir) when moved in place. This makes it so block names are consistent for COW. bq. bq. bq. This addresses bug HBASE-3417. bq. https://issues.apache.org/jira/browse/HBASE-3417 bq. bq. bq. Diffs bq. - bq. bq./src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java 1182679 bq./src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java 1182679 bq./src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1182679 bq./src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1182679 bq./src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 1182679 bq. bq. Diff: https://reviews.apache.org/r/2379/diff bq. bq. bq. Testing bq. --- bq. bq. TestFromClientSide and TestCacheOnWrite both working bq. bq. bq. Thanks, bq. bq. Jonathan bq. bq. CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme -- Key: HBASE-3417 URL: https://issues.apache.org/jira/browse/HBASE-3417 Project: HBase Issue Type: Bug Components: io, regionserver Affects Versions: 0.92.0 Reporter: Jonathan Gray Assignee: Jonathan Gray Priority: Critical Fix For: 0.92.0 Attachments: HBASE-3417-redux-v1.patch, HBASE-3417-v1.patch, HBASE-3417-v2.patch, HBASE-3417-v5.patch Currently the block names used in the block cache are built using the filesystem path. However, for cache on write, the path is a temporary output file. The original COW patch actually made some modifications to block naming stuff to make it more consistent but did not do enough. Should add a separate method somewhere for generating block names using some more easily mocked scheme (rather than just raw path as we generate a random unique file name twice, once for tmp and then again when moved into place). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3417) CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme
[ https://issues.apache.org/jira/browse/HBASE-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13126842#comment-13126842 ] jirapos...@reviews.apache.org commented on HBASE-3417: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2379/#review2564 --- /src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java https://reviews.apache.org/r/2379/#comment5748 I like this suggestion... keeping config immutable and go make a new one if you need to change something... w/ all final. - Michael On 2011-10-13 06:31:13, Jonathan Gray wrote: bq. bq. --- bq. This is an automatically generated e-mail. To reply, visit: bq. https://reviews.apache.org/r/2379/ bq. --- bq. bq. (Updated 2011-10-13 06:31:13) bq. bq. bq. Review request for hbase, Dhruba Borthakur, Michael Stack, and Mikhail Bautin. bq. bq. bq. Summary bq. --- bq. bq. Adds a new test that fails w/o this fix and changes the naming scheme for storefiles to use UUID instead of random longs as ascii. bq. bq. The big change is that the name of the tmp file used when flushing and compacting is the same name (but different dir) when moved in place. This makes it so block names are consistent for COW. bq. bq. bq. This addresses bug HBASE-3417. bq. https://issues.apache.org/jira/browse/HBASE-3417 bq. bq. bq. Diffs bq. - bq. bq./src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java 1182679 bq./src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java 1182679 bq./src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1182679 bq./src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1182679 bq./src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 1182679 bq. bq. Diff: https://reviews.apache.org/r/2379/diff bq. bq. bq. Testing bq. --- bq. bq. TestFromClientSide and TestCacheOnWrite both working bq. bq. bq. Thanks, bq. bq. Jonathan bq. bq. CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme -- Key: HBASE-3417 URL: https://issues.apache.org/jira/browse/HBASE-3417 Project: HBase Issue Type: Bug Components: io, regionserver Affects Versions: 0.92.0 Reporter: Jonathan Gray Assignee: Jonathan Gray Priority: Critical Fix For: 0.92.0 Attachments: HBASE-3417-redux-v1.patch, HBASE-3417-v1.patch, HBASE-3417-v2.patch, HBASE-3417-v5.patch Currently the block names used in the block cache are built using the filesystem path. However, for cache on write, the path is a temporary output file. The original COW patch actually made some modifications to block naming stuff to make it more consistent but did not do enough. Should add a separate method somewhere for generating block names using some more easily mocked scheme (rather than just raw path as we generate a random unique file name twice, once for tmp and then again when moved into place). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3417) CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme
[ https://issues.apache.org/jira/browse/HBASE-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13126841#comment-13126841 ] jirapos...@reviews.apache.org commented on HBASE-3417: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2379/ --- (Updated 2011-10-13 19:33:02.273799) Review request for hbase, Dhruba Borthakur, Michael Stack, and Mikhail Bautin. Changes --- Changes from Mikhail review. Changes it so if we fail rename we throw an exception, otherwise it looks like we would eventually get an NPE down the line? There's no special casing for a null return. Will run test suite before committing. Summary --- Adds a new test that fails w/o this fix and changes the naming scheme for storefiles to use UUID instead of random longs as ascii. The big change is that the name of the tmp file used when flushing and compacting is the same name (but different dir) when moved in place. This makes it so block names are consistent for COW. This addresses bug HBASE-3417. https://issues.apache.org/jira/browse/HBASE-3417 Diffs (updated) - /src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java 1183046 /src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java 1183046 /src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1183046 /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1183046 /src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 1183046 Diff: https://reviews.apache.org/r/2379/diff Testing --- TestFromClientSide and TestCacheOnWrite both working Thanks, Jonathan CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme -- Key: HBASE-3417 URL: https://issues.apache.org/jira/browse/HBASE-3417 Project: HBase Issue Type: Bug Components: io, regionserver Affects Versions: 0.92.0 Reporter: Jonathan Gray Assignee: Jonathan Gray Priority: Critical Fix For: 0.92.0 Attachments: HBASE-3417-redux-v1.patch, HBASE-3417-v1.patch, HBASE-3417-v2.patch, HBASE-3417-v5.patch Currently the block names used in the block cache are built using the filesystem path. However, for cache on write, the path is a temporary output file. The original COW patch actually made some modifications to block naming stuff to make it more consistent but did not do enough. Should add a separate method somewhere for generating block names using some more easily mocked scheme (rather than just raw path as we generate a random unique file name twice, once for tmp and then again when moved into place). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3417) CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme
[ https://issues.apache.org/jira/browse/HBASE-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13126847#comment-13126847 ] jirapos...@reviews.apache.org commented on HBASE-3417: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2379/#review2565 --- /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java https://reviews.apache.org/r/2379/#comment5749 @Mikhail yeah. I figured it out. I was thinking we should allow dashes... then it looks like a uuid and makes compare by human easier. - Michael On 2011-10-13 19:33:02, Jonathan Gray wrote: bq. bq. --- bq. This is an automatically generated e-mail. To reply, visit: bq. https://reviews.apache.org/r/2379/ bq. --- bq. bq. (Updated 2011-10-13 19:33:02) bq. bq. bq. Review request for hbase, Dhruba Borthakur, Michael Stack, and Mikhail Bautin. bq. bq. bq. Summary bq. --- bq. bq. Adds a new test that fails w/o this fix and changes the naming scheme for storefiles to use UUID instead of random longs as ascii. bq. bq. The big change is that the name of the tmp file used when flushing and compacting is the same name (but different dir) when moved in place. This makes it so block names are consistent for COW. bq. bq. bq. This addresses bug HBASE-3417. bq. https://issues.apache.org/jira/browse/HBASE-3417 bq. bq. bq. Diffs bq. - bq. bq./src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java 1183046 bq./src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java 1183046 bq./src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1183046 bq./src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1183046 bq./src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 1183046 bq. bq. Diff: https://reviews.apache.org/r/2379/diff bq. bq. bq. Testing bq. --- bq. bq. TestFromClientSide and TestCacheOnWrite both working bq. bq. bq. Thanks, bq. bq. Jonathan bq. bq. CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme -- Key: HBASE-3417 URL: https://issues.apache.org/jira/browse/HBASE-3417 Project: HBase Issue Type: Bug Components: io, regionserver Affects Versions: 0.92.0 Reporter: Jonathan Gray Assignee: Jonathan Gray Priority: Critical Fix For: 0.92.0 Attachments: HBASE-3417-redux-v1.patch, HBASE-3417-v1.patch, HBASE-3417-v2.patch, HBASE-3417-v5.patch Currently the block names used in the block cache are built using the filesystem path. However, for cache on write, the path is a temporary output file. The original COW patch actually made some modifications to block naming stuff to make it more consistent but did not do enough. Should add a separate method somewhere for generating block names using some more easily mocked scheme (rather than just raw path as we generate a random unique file name twice, once for tmp and then again when moved into place). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3417) CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme
[ https://issues.apache.org/jira/browse/HBASE-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13126848#comment-13126848 ] jirapos...@reviews.apache.org commented on HBASE-3417: -- bq. On 2011-10-13 19:31:38, Michael Stack wrote: bq. /src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java, line 100 bq. https://reviews.apache.org/r/2379/diff/2/?file=49943#file49943line100 bq. bq. I like this suggestion... keeping config immutable and go make a new one if you need to change something... w/ all final. I like the idea too, but it's more clunky. Then CacheConfig can't be final in things like Store and it becomes not very useful in other tests. I'm going to make the test method names more explicit. - Jonathan --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2379/#review2564 --- On 2011-10-13 19:33:02, Jonathan Gray wrote: bq. bq. --- bq. This is an automatically generated e-mail. To reply, visit: bq. https://reviews.apache.org/r/2379/ bq. --- bq. bq. (Updated 2011-10-13 19:33:02) bq. bq. bq. Review request for hbase, Dhruba Borthakur, Michael Stack, and Mikhail Bautin. bq. bq. bq. Summary bq. --- bq. bq. Adds a new test that fails w/o this fix and changes the naming scheme for storefiles to use UUID instead of random longs as ascii. bq. bq. The big change is that the name of the tmp file used when flushing and compacting is the same name (but different dir) when moved in place. This makes it so block names are consistent for COW. bq. bq. bq. This addresses bug HBASE-3417. bq. https://issues.apache.org/jira/browse/HBASE-3417 bq. bq. bq. Diffs bq. - bq. bq./src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java 1183046 bq./src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java 1183046 bq./src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1183046 bq./src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1183046 bq./src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 1183046 bq. bq. Diff: https://reviews.apache.org/r/2379/diff bq. bq. bq. Testing bq. --- bq. bq. TestFromClientSide and TestCacheOnWrite both working bq. bq. bq. Thanks, bq. bq. Jonathan bq. bq. CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme -- Key: HBASE-3417 URL: https://issues.apache.org/jira/browse/HBASE-3417 Project: HBase Issue Type: Bug Components: io, regionserver Affects Versions: 0.92.0 Reporter: Jonathan Gray Assignee: Jonathan Gray Priority: Critical Fix For: 0.92.0 Attachments: HBASE-3417-redux-v1.patch, HBASE-3417-v1.patch, HBASE-3417-v2.patch, HBASE-3417-v5.patch Currently the block names used in the block cache are built using the filesystem path. However, for cache on write, the path is a temporary output file. The original COW patch actually made some modifications to block naming stuff to make it more consistent but did not do enough. Should add a separate method somewhere for generating block names using some more easily mocked scheme (rather than just raw path as we generate a random unique file name twice, once for tmp and then again when moved into place). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3417) CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme
[ https://issues.apache.org/jira/browse/HBASE-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13126851#comment-13126851 ] jirapos...@reviews.apache.org commented on HBASE-3417: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2379/ --- (Updated 2011-10-13 19:36:33.845621) Review request for hbase, Dhruba Borthakur, Michael Stack, and Mikhail Bautin. Changes --- Renamed setters to be prefixed with: forTestsOnly_ (I'm still opting to keep it this way because I know that the next set of tests I write for the compressed cache stuff is going to want to do the same thing) Summary --- Adds a new test that fails w/o this fix and changes the naming scheme for storefiles to use UUID instead of random longs as ascii. The big change is that the name of the tmp file used when flushing and compacting is the same name (but different dir) when moved in place. This makes it so block names are consistent for COW. This addresses bug HBASE-3417. https://issues.apache.org/jira/browse/HBASE-3417 Diffs (updated) - /src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java 1183049 /src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java 1183049 /src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1183049 /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1183049 /src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 1183049 Diff: https://reviews.apache.org/r/2379/diff Testing --- TestFromClientSide and TestCacheOnWrite both working Thanks, Jonathan CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme -- Key: HBASE-3417 URL: https://issues.apache.org/jira/browse/HBASE-3417 Project: HBase Issue Type: Bug Components: io, regionserver Affects Versions: 0.92.0 Reporter: Jonathan Gray Assignee: Jonathan Gray Priority: Critical Fix For: 0.92.0 Attachments: HBASE-3417-redux-v1.patch, HBASE-3417-v1.patch, HBASE-3417-v2.patch, HBASE-3417-v5.patch Currently the block names used in the block cache are built using the filesystem path. However, for cache on write, the path is a temporary output file. The original COW patch actually made some modifications to block naming stuff to make it more consistent but did not do enough. Should add a separate method somewhere for generating block names using some more easily mocked scheme (rather than just raw path as we generate a random unique file name twice, once for tmp and then again when moved into place). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3417) CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme
[ https://issues.apache.org/jira/browse/HBASE-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13126853#comment-13126853 ] jirapos...@reviews.apache.org commented on HBASE-3417: -- bq. On 2011-10-13 19:33:27, Michael Stack wrote: bq. /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java, line 166 bq. https://reviews.apache.org/r/2379/diff/2/?file=49945#file49945line166 bq. bq. @Mikhail yeah. I figured it out. I was thinking we should allow dashes... then it looks like a uuid and makes compare by human easier. we had this argument months ago. there's a good consistency in the UIs w/ the current region encoded names (used as dir names) w/ these uuid (minus -) names. The dashes makes mismatched (as I recall my thoughts on the matter way back when it originally had dashes). i stand by the no dashes! :) - Jonathan --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2379/#review2565 --- On 2011-10-13 19:36:33, Jonathan Gray wrote: bq. bq. --- bq. This is an automatically generated e-mail. To reply, visit: bq. https://reviews.apache.org/r/2379/ bq. --- bq. bq. (Updated 2011-10-13 19:36:33) bq. bq. bq. Review request for hbase, Dhruba Borthakur, Michael Stack, and Mikhail Bautin. bq. bq. bq. Summary bq. --- bq. bq. Adds a new test that fails w/o this fix and changes the naming scheme for storefiles to use UUID instead of random longs as ascii. bq. bq. The big change is that the name of the tmp file used when flushing and compacting is the same name (but different dir) when moved in place. This makes it so block names are consistent for COW. bq. bq. bq. This addresses bug HBASE-3417. bq. https://issues.apache.org/jira/browse/HBASE-3417 bq. bq. bq. Diffs bq. - bq. bq./src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java 1183049 bq./src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java 1183049 bq./src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1183049 bq./src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1183049 bq./src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 1183049 bq. bq. Diff: https://reviews.apache.org/r/2379/diff bq. bq. bq. Testing bq. --- bq. bq. TestFromClientSide and TestCacheOnWrite both working bq. bq. bq. Thanks, bq. bq. Jonathan bq. bq. CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme -- Key: HBASE-3417 URL: https://issues.apache.org/jira/browse/HBASE-3417 Project: HBase Issue Type: Bug Components: io, regionserver Affects Versions: 0.92.0 Reporter: Jonathan Gray Assignee: Jonathan Gray Priority: Critical Fix For: 0.92.0 Attachments: HBASE-3417-redux-v1.patch, HBASE-3417-v1.patch, HBASE-3417-v2.patch, HBASE-3417-v5.patch Currently the block names used in the block cache are built using the filesystem path. However, for cache on write, the path is a temporary output file. The original COW patch actually made some modifications to block naming stuff to make it more consistent but did not do enough. Should add a separate method somewhere for generating block names using some more easily mocked scheme (rather than just raw path as we generate a random unique file name twice, once for tmp and then again when moved into place). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3417) CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme
[ https://issues.apache.org/jira/browse/HBASE-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13126856#comment-13126856 ] jirapos...@reviews.apache.org commented on HBASE-3417: -- bq. On 2011-10-13 19:33:27, Michael Stack wrote: bq. /src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java, line 166 bq. https://reviews.apache.org/r/2379/diff/2/?file=49945#file49945line166 bq. bq. @Mikhail yeah. I figured it out. I was thinking we should allow dashes... then it looks like a uuid and makes compare by human easier. bq. bq. Jonathan Gray wrote: bq. we had this argument months ago. there's a good consistency in the UIs w/ the current region encoded names (used as dir names) w/ these uuid (minus -) names. The dashes makes mismatched (as I recall my thoughts on the matter way back when it originally had dashes). bq. bq. i stand by the no dashes! :) I took a look. The UUID ends up being the key in a LRU cache. I'm fine w/ it (I argued for keeping them back then too... sorry for rehearsing old arg) - Michael --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2379/#review2565 --- On 2011-10-13 19:36:33, Jonathan Gray wrote: bq. bq. --- bq. This is an automatically generated e-mail. To reply, visit: bq. https://reviews.apache.org/r/2379/ bq. --- bq. bq. (Updated 2011-10-13 19:36:33) bq. bq. bq. Review request for hbase, Dhruba Borthakur, Michael Stack, and Mikhail Bautin. bq. bq. bq. Summary bq. --- bq. bq. Adds a new test that fails w/o this fix and changes the naming scheme for storefiles to use UUID instead of random longs as ascii. bq. bq. The big change is that the name of the tmp file used when flushing and compacting is the same name (but different dir) when moved in place. This makes it so block names are consistent for COW. bq. bq. bq. This addresses bug HBASE-3417. bq. https://issues.apache.org/jira/browse/HBASE-3417 bq. bq. bq. Diffs bq. - bq. bq./src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java 1183049 bq./src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java 1183049 bq./src/main/java/org/apache/hadoop/hbase/regionserver/Store.java 1183049 bq./src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java 1183049 bq./src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java 1183049 bq. bq. Diff: https://reviews.apache.org/r/2379/diff bq. bq. bq. Testing bq. --- bq. bq. TestFromClientSide and TestCacheOnWrite both working bq. bq. bq. Thanks, bq. bq. Jonathan bq. bq. CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme -- Key: HBASE-3417 URL: https://issues.apache.org/jira/browse/HBASE-3417 Project: HBase Issue Type: Bug Components: io, regionserver Affects Versions: 0.92.0 Reporter: Jonathan Gray Assignee: Jonathan Gray Priority: Critical Fix For: 0.92.0 Attachments: HBASE-3417-redux-v1.patch, HBASE-3417-v1.patch, HBASE-3417-v2.patch, HBASE-3417-v5.patch Currently the block names used in the block cache are built using the filesystem path. However, for cache on write, the path is a temporary output file. The original COW patch actually made some modifications to block naming stuff to make it more consistent but did not do enough. Should add a separate method somewhere for generating block names using some more easily mocked scheme (rather than just raw path as we generate a random unique file name twice, once for tmp and then again when moved into place). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3417) CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme
[ https://issues.apache.org/jira/browse/HBASE-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13127018#comment-13127018 ] Jonathan Gray commented on HBASE-3417: -- Stack, I was going to open a new JIRA, but it is the exact same issue and a nearly identical patch (primary difference is pre/post hfile v2). It was just incorrect to close this following commit of hfile v2 which was unrelated to this bug. Nothing was ever committed under this JIRA so just reopened with an updated patch. I think things get confusing when there is more than one commit per branch per jira. We should probably ban that practice. Or at least institute some kind of standardized commit message (HBASE-3417, HBASE-3417-B, HBASE-3417-C, etc) or some such thing. CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme -- Key: HBASE-3417 URL: https://issues.apache.org/jira/browse/HBASE-3417 Project: HBase Issue Type: Bug Components: io, regionserver Affects Versions: 0.92.0 Reporter: Jonathan Gray Assignee: Jonathan Gray Priority: Critical Fix For: 0.92.0 Attachments: HBASE-3417-redux-v1.patch, HBASE-3417-v1.patch, HBASE-3417-v2.patch, HBASE-3417-v5.patch Currently the block names used in the block cache are built using the filesystem path. However, for cache on write, the path is a temporary output file. The original COW patch actually made some modifications to block naming stuff to make it more consistent but did not do enough. Should add a separate method somewhere for generating block names using some more easily mocked scheme (rather than just raw path as we generate a random unique file name twice, once for tmp and then again when moved into place). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3417) CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme
[ https://issues.apache.org/jira/browse/HBASE-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13127051#comment-13127051 ] Jonathan Gray commented on HBASE-3417: -- I didn't mark as incompatible but it is only one-way compatible. There is actually a very trivial change that can be made in the 0.90 branch (or any other branches) to make this change compatible in all directions. Just need to update the REF_NAME_PARSER regex to be what it is in this change (tolerant of [a-f] in addition to digits). That's it. CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme -- Key: HBASE-3417 URL: https://issues.apache.org/jira/browse/HBASE-3417 Project: HBase Issue Type: Bug Components: io, regionserver Affects Versions: 0.92.0 Reporter: Jonathan Gray Assignee: Jonathan Gray Priority: Critical Fix For: 0.92.0 Attachments: HBASE-3417-redux-v1.patch, HBASE-3417-v1.patch, HBASE-3417-v2.patch, HBASE-3417-v5.patch Currently the block names used in the block cache are built using the filesystem path. However, for cache on write, the path is a temporary output file. The original COW patch actually made some modifications to block naming stuff to make it more consistent but did not do enough. Should add a separate method somewhere for generating block names using some more easily mocked scheme (rather than just raw path as we generate a random unique file name twice, once for tmp and then again when moved into place). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3417) CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme
[ https://issues.apache.org/jira/browse/HBASE-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13127054#comment-13127054 ] Jonathan Gray commented on HBASE-3417: -- In StoreFile.java: {code} private static final Pattern REF_NAME_PARSER = -Pattern.compile(^(\\d+)(?:\\.(.+))?$); +Pattern.compile(^([0-9a-f]+)(?:\\.(.+))?$); {code} If you ever need to go backwards from 92 to a previous version. CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme -- Key: HBASE-3417 URL: https://issues.apache.org/jira/browse/HBASE-3417 Project: HBase Issue Type: Bug Components: io, regionserver Affects Versions: 0.92.0 Reporter: Jonathan Gray Assignee: Jonathan Gray Priority: Critical Fix For: 0.92.0 Attachments: HBASE-3417-redux-v1.patch, HBASE-3417-v1.patch, HBASE-3417-v2.patch, HBASE-3417-v5.patch Currently the block names used in the block cache are built using the filesystem path. However, for cache on write, the path is a temporary output file. The original COW patch actually made some modifications to block naming stuff to make it more consistent but did not do enough. Should add a separate method somewhere for generating block names using some more easily mocked scheme (rather than just raw path as we generate a random unique file name twice, once for tmp and then again when moved into place). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3417) CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme
[ https://issues.apache.org/jira/browse/HBASE-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13127197#comment-13127197 ] Hudson commented on HBASE-3417: --- Integrated in HBase-0.92 #63 (See [https://builds.apache.org/job/HBase-0.92/63/]) HBASE-3417 CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme (jgray) jgray : Files : * /hbase/branches/0.92/CHANGES.txt * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/io/hfile/AbstractHFileWriter.java * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/io/hfile/CacheConfig.java * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java * /hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/client/TestFromClientSide.java CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme -- Key: HBASE-3417 URL: https://issues.apache.org/jira/browse/HBASE-3417 Project: HBase Issue Type: Bug Components: io, regionserver Affects Versions: 0.92.0 Reporter: Jonathan Gray Assignee: Jonathan Gray Priority: Critical Fix For: 0.92.0 Attachments: HBASE-3417-redux-v1.patch, HBASE-3417-v1.patch, HBASE-3417-v2.patch, HBASE-3417-v5.patch Currently the block names used in the block cache are built using the filesystem path. However, for cache on write, the path is a temporary output file. The original COW patch actually made some modifications to block naming stuff to make it more consistent but did not do enough. Should add a separate method somewhere for generating block names using some more easily mocked scheme (rather than just raw path as we generate a random unique file name twice, once for tmp and then again when moved into place). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3417) CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme
[ https://issues.apache.org/jira/browse/HBASE-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13069204#comment-13069204 ] Jonathan Gray commented on HBASE-3417: -- It does support COW but if it doesn't include changes to how files are named, it will still need this fix. Will follow-up with Mikhail. CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme -- Key: HBASE-3417 URL: https://issues.apache.org/jira/browse/HBASE-3417 Project: HBase Issue Type: Bug Components: io, regionserver Affects Versions: 0.92.0 Reporter: Jonathan Gray Assignee: Jonathan Gray Priority: Critical Fix For: 0.92.0 Attachments: HBASE-3417-v1.patch, HBASE-3417-v2.patch, HBASE-3417-v5.patch Currently the block names used in the block cache are built using the filesystem path. However, for cache on write, the path is a temporary output file. The original COW patch actually made some modifications to block naming stuff to make it more consistent but did not do enough. Should add a separate method somewhere for generating block names using some more easily mocked scheme (rather than just raw path as we generate a random unique file name twice, once for tmp and then again when moved into place). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3417) CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme
[ https://issues.apache.org/jira/browse/HBASE-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13069216#comment-13069216 ] Mikhail Bautin commented on HBASE-3417: --- @stack: This patch is already part of HFile-v2. Thanks! CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme -- Key: HBASE-3417 URL: https://issues.apache.org/jira/browse/HBASE-3417 Project: HBase Issue Type: Bug Components: io, regionserver Affects Versions: 0.92.0 Reporter: Jonathan Gray Assignee: Jonathan Gray Priority: Critical Fix For: 0.92.0 Attachments: HBASE-3417-v1.patch, HBASE-3417-v2.patch, HBASE-3417-v5.patch Currently the block names used in the block cache are built using the filesystem path. However, for cache on write, the path is a temporary output file. The original COW patch actually made some modifications to block naming stuff to make it more consistent but did not do enough. Should add a separate method somewhere for generating block names using some more easily mocked scheme (rather than just raw path as we generate a random unique file name twice, once for tmp and then again when moved into place). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3417) CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme
[ https://issues.apache.org/jira/browse/HBASE-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13009396#comment-13009396 ] stack commented on HBASE-3417: -- You running this patch still Jon or did you do something else? CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme -- Key: HBASE-3417 URL: https://issues.apache.org/jira/browse/HBASE-3417 Project: HBase Issue Type: Bug Components: io, regionserver Affects Versions: 0.92.0 Reporter: Jonathan Gray Assignee: Jonathan Gray Priority: Critical Fix For: 0.92.0 Attachments: HBASE-3417-v1.patch, HBASE-3417-v2.patch, HBASE-3417-v5.patch Currently the block names used in the block cache are built using the filesystem path. However, for cache on write, the path is a temporary output file. The original COW patch actually made some modifications to block naming stuff to make it more consistent but did not do enough. Should add a separate method somewhere for generating block names using some more easily mocked scheme (rather than just raw path as we generate a random unique file name twice, once for tmp and then again when moved into place). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3417) CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme
[ https://issues.apache.org/jira/browse/HBASE-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13009449#comment-13009449 ] Jonathan Gray commented on HBASE-3417: -- Just verified that this is the same as what we have been running with in production (since the patch was put up in January). I'm ready to commit if you want to +1 me :) CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme -- Key: HBASE-3417 URL: https://issues.apache.org/jira/browse/HBASE-3417 Project: HBase Issue Type: Bug Components: io, regionserver Affects Versions: 0.92.0 Reporter: Jonathan Gray Assignee: Jonathan Gray Priority: Critical Fix For: 0.92.0 Attachments: HBASE-3417-v1.patch, HBASE-3417-v2.patch, HBASE-3417-v5.patch Currently the block names used in the block cache are built using the filesystem path. However, for cache on write, the path is a temporary output file. The original COW patch actually made some modifications to block naming stuff to make it more consistent but did not do enough. Should add a separate method somewhere for generating block names using some more easily mocked scheme (rather than just raw path as we generate a random unique file name twice, once for tmp and then again when moved into place). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3417) CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme
[ https://issues.apache.org/jira/browse/HBASE-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13009537#comment-13009537 ] stack commented on HBASE-3417: -- +1 on patch for 0.92 (But don't include the TestResettingCounters pollution). CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme -- Key: HBASE-3417 URL: https://issues.apache.org/jira/browse/HBASE-3417 Project: HBase Issue Type: Bug Components: io, regionserver Affects Versions: 0.92.0 Reporter: Jonathan Gray Assignee: Jonathan Gray Priority: Critical Fix For: 0.92.0 Attachments: HBASE-3417-v1.patch, HBASE-3417-v2.patch, HBASE-3417-v5.patch Currently the block names used in the block cache are built using the filesystem path. However, for cache on write, the path is a temporary output file. The original COW patch actually made some modifications to block naming stuff to make it more consistent but did not do enough. Should add a separate method somewhere for generating block names using some more easily mocked scheme (rather than just raw path as we generate a random unique file name twice, once for tmp and then again when moved into place). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Commented: (HBASE-3417) CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme
[ https://issues.apache.org/jira/browse/HBASE-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12978427#action_12978427 ] stack commented on HBASE-3417: -- Should you add in A-Z in below just in case? {code} +Pattern.compile(^([0-9a-f]+)(?:\\.(.+))?$); {code} Yeah, don't replace the '-' I'd say: {code} +return new Path(dir, UUID.randomUUID().toString().replaceAll(-, ) ++ ((suffix == null || suffix.length() = 0) ? : suffix)); {code} Then its easy to go back to UUID. You might want to do that so you can use the 128 bits as key in LRU rather than String? Otherwise, +1 IFF verified backward compatible. CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme -- Key: HBASE-3417 URL: https://issues.apache.org/jira/browse/HBASE-3417 Project: HBase Issue Type: Bug Components: io, regionserver Affects Versions: 0.92.0 Reporter: Jonathan Gray Assignee: Jonathan Gray Priority: Critical Fix For: 0.92.0 Attachments: HBASE-3417-v1.patch, HBASE-3417-v2.patch, HBASE-3417-v5.patch Currently the block names used in the block cache are built using the filesystem path. However, for cache on write, the path is a temporary output file. The original COW patch actually made some modifications to block naming stuff to make it more consistent but did not do enough. Should add a separate method somewhere for generating block names using some more easily mocked scheme (rather than just raw path as we generate a random unique file name twice, once for tmp and then again when moved into place). -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HBASE-3417) CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme
[ https://issues.apache.org/jira/browse/HBASE-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12978431#action_12978431 ] Jonathan Gray commented on HBASE-3417: -- bq. Should you add in A-Z in below just in case? Could add A-F (uuid is hex chars only), but it's unnecessary. bq. Then its easy to go back to UUID. You might want to do that so you can use the 128 bits as key in LRU rather than String? LRU uses a String for block name. I think it looks much nicer with a consistent looking naming scheme for region directories and storefiles. And I don't think we need to be overly concerned about the size... If 64K block, in the LRU we're talking about 0.05% overhead (or like 0.02% over a more compact version). Also, traditional GUID format reminds me of Microsoft SQL Server :) This latest v5 patch is being deployed on a 100 node cluster with existing data tonight. Will commit once verified that it's working there. CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme -- Key: HBASE-3417 URL: https://issues.apache.org/jira/browse/HBASE-3417 Project: HBase Issue Type: Bug Components: io, regionserver Affects Versions: 0.92.0 Reporter: Jonathan Gray Assignee: Jonathan Gray Priority: Critical Fix For: 0.92.0 Attachments: HBASE-3417-v1.patch, HBASE-3417-v2.patch, HBASE-3417-v5.patch Currently the block names used in the block cache are built using the filesystem path. However, for cache on write, the path is a temporary output file. The original COW patch actually made some modifications to block naming stuff to make it more consistent but did not do enough. Should add a separate method somewhere for generating block names using some more easily mocked scheme (rather than just raw path as we generate a random unique file name twice, once for tmp and then again when moved into place). -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HBASE-3417) CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme
[ https://issues.apache.org/jira/browse/HBASE-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12977943#action_12977943 ] Jonathan Gray commented on HBASE-3417: -- One idea from discussion with stack is to use a UUID for the filename. That way we can generate it once for the temporary file and then just move it in place without doing a rename. Would then just use UUID + blockNumber as the blockName. CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme -- Key: HBASE-3417 URL: https://issues.apache.org/jira/browse/HBASE-3417 Project: HBase Issue Type: Bug Components: io, regionserver Affects Versions: 0.92.0 Reporter: Jonathan Gray Assignee: Jonathan Gray Priority: Critical Fix For: 0.92.0 Currently the block names used in the block cache are built using the filesystem path. However, for cache on write, the path is a temporary output file. The original COW patch actually made some modifications to block naming stuff to make it more consistent but did not do enough. Should add a separate method somewhere for generating block names using some more easily mocked scheme (rather than just raw path as we generate a random unique file name twice, once for tmp and then again when moved into place). -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HBASE-3417) CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme
[ https://issues.apache.org/jira/browse/HBASE-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12978122#action_12978122 ] stack commented on HBASE-3417: -- As discussed up on IRC, this is not backward compatible: {code} +Pattern.compile(^(\\w{32})(?:\\.(.+))?$); {code} You can do a range IIRC 20-32 (was old length 20 chars?) The below is a little bit messy.: {code} +return new Path(dir, UUID.randomUUID().toString().replaceAll(-, ) ++ ((suffix == null || suffix.length() = 0) ? : suffix)); {code} Up on IRC, was thinking should base64 because then it'd be more compact. See http://stackoverflow.com/questions/772802/storing-uuid-as-base64-string. There is also in hbase util a Base64#encodeBytes method that will take the 128 UUID bits and emit them as base64 (Possible to get it all down to 22 chars). But looking at the base64 vocabulary, http://en.wikipedia.org/wiki/Base64, it includes '+' and '/' which are illegal in URL, a hdfs filepath. Base32? http://en.wikipedia.org/wiki/Base32? But that won't work either. Has to be multiples of 40 bits. Maybe leave it as it comes out of UUID.toString w/ hyphens. Then its plain its a UUID and its easier to read? CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme -- Key: HBASE-3417 URL: https://issues.apache.org/jira/browse/HBASE-3417 Project: HBase Issue Type: Bug Components: io, regionserver Affects Versions: 0.92.0 Reporter: Jonathan Gray Assignee: Jonathan Gray Priority: Critical Fix For: 0.92.0 Attachments: HBASE-3417-v1.patch, HBASE-3417-v2.patch Currently the block names used in the block cache are built using the filesystem path. However, for cache on write, the path is a temporary output file. The original COW patch actually made some modifications to block naming stuff to make it more consistent but did not do enough. Should add a separate method somewhere for generating block names using some more easily mocked scheme (rather than just raw path as we generate a random unique file name twice, once for tmp and then again when moved into place). -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HBASE-3417) CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme
[ https://issues.apache.org/jira/browse/HBASE-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12978124#action_12978124 ] Jonathan Gray commented on HBASE-3417: -- I changed regex to be {{([0-9a-z]+)}} I kind of like how it is. It looks just like the encoded region names used for region directory names. CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme -- Key: HBASE-3417 URL: https://issues.apache.org/jira/browse/HBASE-3417 Project: HBase Issue Type: Bug Components: io, regionserver Affects Versions: 0.92.0 Reporter: Jonathan Gray Assignee: Jonathan Gray Priority: Critical Fix For: 0.92.0 Attachments: HBASE-3417-v1.patch, HBASE-3417-v2.patch Currently the block names used in the block cache are built using the filesystem path. However, for cache on write, the path is a temporary output file. The original COW patch actually made some modifications to block naming stuff to make it more consistent but did not do enough. Should add a separate method somewhere for generating block names using some more easily mocked scheme (rather than just raw path as we generate a random unique file name twice, once for tmp and then again when moved into place). -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HBASE-3417) CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme
[ https://issues.apache.org/jira/browse/HBASE-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12978125#action_12978125 ] Jonathan Gray commented on HBASE-3417: -- Old random file name was using rand.nextLong() so it could be any length = 1. CacheOnWrite is using the temporary output path for block names, need to use a more consistent block naming scheme -- Key: HBASE-3417 URL: https://issues.apache.org/jira/browse/HBASE-3417 Project: HBase Issue Type: Bug Components: io, regionserver Affects Versions: 0.92.0 Reporter: Jonathan Gray Assignee: Jonathan Gray Priority: Critical Fix For: 0.92.0 Attachments: HBASE-3417-v1.patch, HBASE-3417-v2.patch Currently the block names used in the block cache are built using the filesystem path. However, for cache on write, the path is a temporary output file. The original COW patch actually made some modifications to block naming stuff to make it more consistent but did not do enough. Should add a separate method somewhere for generating block names using some more easily mocked scheme (rather than just raw path as we generate a random unique file name twice, once for tmp and then again when moved into place). -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.