[jira] [Commented] (HBASE-6028) Implement a cancel for in-progress compactions
[ https://issues.apache.org/jira/browse/HBASE-6028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14617698#comment-14617698 ] Ishan Chhabra commented on HBASE-6028: -- [~esteban], are you working on this? > Implement a cancel for in-progress compactions > -- > > Key: HBASE-6028 > URL: https://issues.apache.org/jira/browse/HBASE-6028 > Project: HBase > Issue Type: Bug > Components: regionserver >Reporter: Derek Wollenstein >Assignee: Esteban Gutierrez >Priority: Minor > Labels: beginner > > Depending on current server load, it can be extremely expensive to run > periodic minor / major compactions. It would be helpful to have a feature > where a user could use the shell or a client tool to explicitly cancel an > in-progress compaction. This would allow a system to recover when too many > regions become eligible for compactions at once -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-11789) LoadIncrementalHFiles is not picking up the -D option
[ https://issues.apache.org/jira/browse/HBASE-11789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14105762#comment-14105762 ] Ishan Chhabra commented on HBASE-11789: --- Can you apply this to 0.96 and 0.94 head as well? > LoadIncrementalHFiles is not picking up the -D option > -- > > Key: HBASE-11789 > URL: https://issues.apache.org/jira/browse/HBASE-11789 > Project: HBase > Issue Type: Bug >Affects Versions: 0.99.0, 0.98.5, 2.0.0 >Reporter: Matteo Bertozzi >Assignee: Matteo Bertozzi > Fix For: 0.99.0, 2.0.0, 0.98.6 > > Attachments: HBASE-11789-v0.patch > > > LoadIncrementalHFiles is not using the Tool class correctly, preventing the > use of the -D options. -- This message was sent by Atlassian JIRA (v6.2#6252)
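For readers following along, the -D handling in question comes from Hadoop's standard Tool/ToolRunner pattern: ToolRunner runs GenericOptionsParser over the command line so that -D key=value pairs land in the Configuration before run() is invoked. A minimal sketch of that pattern, with a made-up driver class name (this is not the HBASE-11789 patch itself):
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

// Hypothetical driver name, used only to illustrate the Tool pattern.
public class BulkLoadDriver extends Configured implements Tool {

  @Override
  public int run(String[] args) throws Exception {
    // getConf() already reflects any -D key=value options, because ToolRunner
    // ran GenericOptionsParser over the command line before calling run().
    Configuration conf = getConf();
    // ... wire up and run the bulk load using conf and the remaining args ...
    return 0;
  }

  public static void main(String[] args) throws Exception {
    // ToolRunner strips the generic options (-D, -conf, -fs, ...) and passes
    // only the leftover arguments to run().
    System.exit(ToolRunner.run(new Configuration(), new BulkLoadDriver(), args));
  }
}
{code}
Invoked as, say, {{hadoop jar mytool.jar BulkLoadDriver -D some.key=value /hfiles mytable}}, the -D setting is then visible through getConf() inside run().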
[jira] [Commented] (HBASE-11642) EOL 0.96
[ https://issues.apache.org/jira/browse/HBASE-11642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14085992#comment-14085992 ] Ishan Chhabra commented on HBASE-11642: --- Hmm, I agree and can relate to the pain associated with a co-ordinated upgrade, having gone through it myself. Thinking about it more, I agree a full EOL is not a good idea for 0.94, but is there a way to make its status clearer and more explicit (maybe "extended maintenance" like Python 2.x) so that people running smaller clusters would consider upgrades and newer users don't use 0.94 accidentally (I see some of them doing that mostly because they start with CDH 4). It is easy for a casual observer to think that 0.94 still has active releases, whereas it is made clear in the JIRAs that new features should not be added to this branch. As a side note, this line should definitely be fixed on the download pages (e.g. http://www.carfab.com/apachesoftware/hbase/): "The 0.96.x series supercedes 0.94.x. We are leaving the 'stable' pointer on the latest 0.94.x for now while 0.96 is still 'fresh'." > EOL 0.96 > > > Key: HBASE-11642 > URL: https://issues.apache.org/jira/browse/HBASE-11642 > Project: HBase > Issue Type: Task >Reporter: stack >Assignee: stack > > Do the work to EOL 0.96. > + No more patches on 0.96 > + Remove 0.96 from downloads. > + If user has issue with 0.96 and needs fix, fix it in 0.98 and have the user > upgrade to get the fix. > + Write email to user list stating 0.96 has been EOL'd September 1st? And add > notice to refguide. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11642) EOL 0.96
[ https://issues.apache.org/jira/browse/HBASE-11642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14085885#comment-14085885 ] Ishan Chhabra commented on HBASE-11642: --- [~lhofhansl], I agree with your reasoning for maintaining 0.94 a little longer (given the incompatibility and a stop-the-world upgrade process), but I am worried we might end up in a Python 2.x/3.x like scenario. Declaring an EOL for 0.94 might be a good way to trigger upgrades for many clients who have not done it yet and would bring more focus to the community (I see many JIRAs marked won't-backport for 0.94 but with patches attached, and people will keep coming back to the mailing lists seeking help for these and other issues). > EOL 0.96 > > > Key: HBASE-11642 > URL: https://issues.apache.org/jira/browse/HBASE-11642 > Project: HBase > Issue Type: Task >Reporter: stack >Assignee: stack > > Do the work to EOL 0.96. > + No more patches on 0.96 > + Remove 0.96 from downloads. > + If user has issue with 0.96 and needs fix, fix it in 0.98 and have the user > upgrade to get the fix. > + Write email to user list stating 0.96 has been EOL'd September 1st? And add > notice to refguide. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11642) EOL 0.96
[ https://issues.apache.org/jira/browse/HBASE-11642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14085154#comment-14085154 ] Ishan Chhabra commented on HBASE-11642: --- Is there a plan to EOL 0.94 soon as well? > EOL 0.96 > > > Key: HBASE-11642 > URL: https://issues.apache.org/jira/browse/HBASE-11642 > Project: HBase > Issue Type: Task >Reporter: stack >Assignee: stack > > Do the work to EOL 0.96. > + No more patches on 0.96 > + Remove 0.96 from downloads. > + If user has issue with 0.96 and needs fix, fix it in 0.98 and have the user > upgrade to get the fix. > + Write email to user list stating 0.96 has been EOL'd September 1st? And add > notice to refguide. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11635) Deprecate TableMapReduceUtil.setScannerCaching
[ https://issues.apache.org/jira/browse/HBASE-11635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14084206#comment-14084206 ] Ishan Chhabra commented on HBASE-11635: --- [~ndimiduk], The config is also used by the HBase client to decide the value of caching if it is not set on the scan object. Here is the relevant piece of code from ClientScanner.java
{code}
// Use the caching from the Scan. If not set, use the default cache setting for this table.
if (this.scan.getCaching() > 0) {
  this.caching = this.scan.getCaching();
} else {
  this.caching = conf.getInt(
      HConstants.HBASE_CLIENT_SCANNER_CACHING,
      HConstants.DEFAULT_HBASE_CLIENT_SCANNER_CACHING);
}
{code}
If we deprecate and remove the config, it also means that you cannot set caching at the client side using this config. Is that ok? > Deprecate TableMapReduceUtil.setScannerCaching > -- > > Key: HBASE-11635 > URL: https://issues.apache.org/jira/browse/HBASE-11635 > Project: HBase > Issue Type: Improvement > Components: mapreduce >Affects Versions: 1.0.0, 0.98.4, 2.0.0 >Reporter: Ishan Chhabra >Assignee: Ishan Chhabra > > See discussion in HBASE-11558. > Currently there are 2 ways to specify scanner caching when invoking a MR job > using TableMapReduceUtil. > 1. By setting the caching on the Scan Object. > 2. By setting the "hbase.client.scanner.caching" config using > TableMapReduceUtil.setScannerCaching. > This JIRA attempts to deprecate the latter. -- This message was sent by Atlassian JIRA (v6.2#6252)
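To make the concern concrete, here is a minimal sketch of a plain (non-MapReduce) client that relies on the config rather than Scan.setCaching(), i.e. the usage that would lose a knob if the config were removed. The table name and caching value are placeholders, and the pre-1.0 HTable API is assumed:
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HConstants;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;

public class ScanWithConfCaching {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    // Client-side default; used because caching is never set on the Scan below.
    conf.setInt(HConstants.HBASE_CLIENT_SCANNER_CACHING, 500);

    HTable table = new HTable(conf, "my_table");   // placeholder table name
    Scan scan = new Scan();                        // note: no scan.setCaching()
    ResultScanner scanner = table.getScanner(scan);
    try {
      for (Result r : scanner) {
        // process rows; each scanner RPC fetches up to 500 rows
        // thanks to the ClientScanner fallback shown in the comment above
      }
    } finally {
      scanner.close();
      table.close();
    }
  }
}
{code}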
[jira] [Created] (HBASE-11635) Deprecate TableMapReduceUtil.setScannerCaching
Ishan Chhabra created HBASE-11635: - Summary: Deprecate TableMapReduceUtil.setScannerCaching Key: HBASE-11635 URL: https://issues.apache.org/jira/browse/HBASE-11635 Project: HBase Issue Type: Improvement Components: mapreduce Affects Versions: 0.98.4, 1.0.0, 2.0.0 Reporter: Ishan Chhabra Assignee: Ishan Chhabra See discussion in HBASE-11558. Currently there are 2 ways to specify scanner caching when invoking a MR job using TableMapReduceUtil. 1. By setting the caching on the Scan Object. 2. By setting the "hbase.client.scanner.caching" config using TableMapReduceUtil.setScannerCaching. This JIRA attempts to deprecate the latter. -- This message was sent by Atlassian JIRA (v6.2#6252)
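For illustration, a sketch of the two ways described above as they would appear in a job driver; the table name, job name and mapper are placeholders, not part of this JIRA:
{code}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.mapreduce.Job;

public class CachingWaysExample {

  // Placeholder mapper; does nothing, only here to make the driver complete.
  static class MyMapper extends TableMapper<ImmutableBytesWritable, Result> {
    @Override
    protected void map(ImmutableBytesWritable key, Result value, Context context)
        throws IOException, InterruptedException {
      // no-op
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    Job job = new Job(conf, "caching example");   // placeholder job name
    Scan scan = new Scan();

    // Way 1: set caching on the Scan object itself.
    scan.setCaching(500);

    // Way 2: set the "hbase.client.scanner.caching" config, either directly or
    // via the helper this JIRA proposes to deprecate.
    TableMapReduceUtil.setScannerCaching(job, 500);
    // equivalent to:
    // job.getConfiguration().setInt("hbase.client.scanner.caching", 500);

    // With HBASE-11558 in place, the value set on the Scan (way 1) wins over
    // the config (way 2).
    TableMapReduceUtil.initTableMapperJob("my_table", scan, MyMapper.class,
        ImmutableBytesWritable.class, Result.class, job);
    job.waitForCompletion(true);
  }
}
{code}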
[jira] [Commented] (HBASE-11635) Deprecate TableMapReduceUtil.setScannerCaching
[ https://issues.apache.org/jira/browse/HBASE-11635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14081345#comment-14081345 ] Ishan Chhabra commented on HBASE-11635: --- [~ndimiduk], I might have read your comment wrong, but were you also saying that we should get rid of the config altogether? > Deprecate TableMapReduceUtil.setScannerCaching > -- > > Key: HBASE-11635 > URL: https://issues.apache.org/jira/browse/HBASE-11635 > Project: HBase > Issue Type: Improvement > Components: mapreduce >Affects Versions: 1.0.0, 0.98.4, 2.0.0 >Reporter: Ishan Chhabra >Assignee: Ishan Chhabra > > See discussion in HBASE-11558. > Currently there are 2 ways to specify scanner caching when invoking a MR job > using TableMapReduceUtil. > 1. By setting the caching on the Scan Object. > 2. By setting the "hbase.client.scanner.caching" config using > TableMapReduceUtil.setScannerCaching. > This JIRA attempts to deprecate the latter. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-11558) Caching set on Scan object gets lost when using TableMapReduceUtil in 0.95+
[ https://issues.apache.org/jira/browse/HBASE-11558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ishan Chhabra updated HBASE-11558: -- Release Note: TableMapReduceUtil now restores the option to set scanner caching by setting it on the scanner object. The priority order for choosing the scanner caching is as follows: 1. Caching set on the scan object. 2. Caching specified via the config "hbase.client.scanner.caching", which can either be set manually on the conf or via the helper method TableMapReduceUtil.setScannerCaching(). 3. The default value HConstants.DEFAULT_HBASE_CLIENT_SCANNER_CACHING, which is set to 100 currently. > Caching set on Scan object gets lost when using TableMapReduceUtil in 0.95+ > --- > > Key: HBASE-11558 > URL: https://issues.apache.org/jira/browse/HBASE-11558 > Project: HBase > Issue Type: Bug > Components: mapreduce, Scanners >Reporter: Ishan Chhabra >Assignee: Ishan Chhabra > Fix For: 0.99.0, 0.96.3, 0.98.5, 2.0.0 > > Attachments: HBASE_11558-0.96.patch, HBASE_11558-0.96_v2.patch, > HBASE_11558-0.98.patch, HBASE_11558-0.98_v2.patch, HBASE_11558.patch, > HBASE_11558_v2.patch, HBASE_11558_v2.patch > > > 0.94 and before, if one sets caching on the Scan object in the Job by calling > scan.setCaching(int) and passes it to TableMapReduceUtil, it is correctly > read and used by the mappers during a mapreduce job. This is because > Scan.write respects and serializes caching, which is used internally by > TableMapReduceUtil to serialize and transfer the scan object to the mappers. > 0.95+, after the move to protobuf, ProtobufUtil.toScan does not respect > caching anymore as ClientProtos.Scan does not have the field caching. Caching > is passed via the ScanRequest object to the server and so is not needed in > the Scan object. However, this breaks application code that relies on the > earlier behavior. This will lead to sudden degradation in Scan performance > 0.96+ for users relying on the old behavior. > There are 2 options here: > 1. Add caching to Scan object, adding an extra int to the payload for the > Scan object which is really not needed in the general case. > 2. Document and preach that TableMapReduceUtil.setScannerCaching must be > called by the client. -- This message was sent by Atlassian JIRA (v6.2#6252)
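Expressed as code, the priority order in the release note amounts to roughly the following lookup, where {{scan}} and {{conf}} stand for the job's Scan and Configuration (a paraphrase, not the literal TableMapReduceUtil source):
{code}
// 1. caching set on the Scan object wins;
// 2. otherwise the "hbase.client.scanner.caching" config (set directly on the
//    conf or via TableMapReduceUtil.setScannerCaching());
// 3. otherwise HConstants.DEFAULT_HBASE_CLIENT_SCANNER_CACHING (currently 100).
int caching = scan.getCaching() > 0
    ? scan.getCaching()
    : conf.getInt(HConstants.HBASE_CLIENT_SCANNER_CACHING,
                  HConstants.DEFAULT_HBASE_CLIENT_SCANNER_CACHING);
{code}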
[jira] [Commented] (HBASE-11558) Caching set on Scan object gets lost when using TableMapReduceUtil in 0.95+
[ https://issues.apache.org/jira/browse/HBASE-11558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080304#comment-14080304 ] Ishan Chhabra commented on HBASE-11558: --- Updated release notes. It makes sense to remove the second method. Do you propose to delete the method or mark it as deprecated for now? Which branches should get this patch? I can open a separate JIRA and put in the patch there once the answers are clear. > Caching set on Scan object gets lost when using TableMapReduceUtil in 0.95+ > --- > > Key: HBASE-11558 > URL: https://issues.apache.org/jira/browse/HBASE-11558 > Project: HBase > Issue Type: Bug > Components: mapreduce, Scanners >Reporter: Ishan Chhabra >Assignee: Ishan Chhabra > Fix For: 0.99.0, 0.96.3, 0.98.5, 2.0.0 > > Attachments: HBASE_11558-0.96.patch, HBASE_11558-0.96_v2.patch, > HBASE_11558-0.98.patch, HBASE_11558-0.98_v2.patch, HBASE_11558.patch, > HBASE_11558_v2.patch, HBASE_11558_v2.patch > > > 0.94 and before, if one sets caching on the Scan object in the Job by calling > scan.setCaching(int) and passes it to TableMapReduceUtil, it is correctly > read and used by the mappers during a mapreduce job. This is because > Scan.write respects and serializes caching, which is used internally by > TableMapReduceUtil to serialize and transfer the scan object to the mappers. > 0.95+, after the move to protobuf, ProtobufUtil.toScan does not respect > caching anymore as ClientProtos.Scan does not have the field caching. Caching > is passed via the ScanRequest object to the server and so is not needed in > the Scan object. However, this breaks application code that relies on the > earlier behavior. This will lead to sudden degradation in Scan performance > 0.96+ for users relying on the old behavior. > There are 2 options here: > 1. Add caching to Scan object, adding an extra int to the payload for the > Scan object which is really not needed in the general case. > 2. Document and preach that TableMapReduceUtil.setScannerCaching must be > called by the client. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-11558) Caching set on Scan object gets lost when using TableMapReduceUtil in 0.95+
[ https://issues.apache.org/jira/browse/HBASE-11558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ishan Chhabra updated HBASE-11558: -- Release Note: TableMapReduceUtil now restores the option to set scanner caching by setting it on the Scan object that is passed in. The priority order for choosing the scanner caching is as follows: 1. Caching set on the scan object. 2. Caching specified via the config "hbase.client.scanner.caching", which can either be set manually on the conf or via the helper method TableMapReduceUtil.setScannerCaching(). 3. The default value HConstants.DEFAULT_HBASE_CLIENT_SCANNER_CACHING, which is set to 100 currently. was: TableMapReduceUtil now restores the option to set scanner caching by setting it on the scanner object. The priority order for choosing the scanner caching is as follows: 1. Caching set on the scan object. 2. Caching specified via the config "hbase.client.scanner.caching", which can either be set manually on the conf or via the helper method TableMapReduceUtil.setScannerCaching(). 3. The default value HConstants.DEFAULT_HBASE_CLIENT_SCANNER_CACHING, which is set to 100 currently. > Caching set on Scan object gets lost when using TableMapReduceUtil in 0.95+ > --- > > Key: HBASE-11558 > URL: https://issues.apache.org/jira/browse/HBASE-11558 > Project: HBase > Issue Type: Bug > Components: mapreduce, Scanners >Reporter: Ishan Chhabra >Assignee: Ishan Chhabra > Fix For: 0.99.0, 0.96.3, 0.98.5, 2.0.0 > > Attachments: HBASE_11558-0.96.patch, HBASE_11558-0.96_v2.patch, > HBASE_11558-0.98.patch, HBASE_11558-0.98_v2.patch, HBASE_11558.patch, > HBASE_11558_v2.patch, HBASE_11558_v2.patch > > > 0.94 and before, if one sets caching on the Scan object in the Job by calling > scan.setCaching(int) and passes it to TableMapReduceUtil, it is correctly > read and used by the mappers during a mapreduce job. This is because > Scan.write respects and serializes caching, which is used internally by > TableMapReduceUtil to serialize and transfer the scan object to the mappers. > 0.95+, after the move to protobuf, ProtobufUtil.toScan does not respect > caching anymore as ClientProtos.Scan does not have the field caching. Caching > is passed via the ScanRequest object to the server and so is not needed in > the Scan object. However, this breaks application code that relies on the > earlier behavior. This will lead to sudden degradation in Scan performance > 0.96+ for users relying on the old behavior. > There are 2 options here: > 1. Add caching to Scan object, adding an extra int to the payload for the > Scan object which is really not needed in the general case. > 2. Document and preach that TableMapReduceUtil.setScannerCaching must be > called by the client. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11558) Caching set on Scan object gets lost when using TableMapReduceUtil in 0.95+
[ https://issues.apache.org/jira/browse/HBASE-11558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14078995#comment-14078995 ] Ishan Chhabra commented on HBASE-11558: --- [~ndimiduk], can you +1 and commit? > Caching set on Scan object gets lost when using TableMapReduceUtil in 0.95+ > --- > > Key: HBASE-11558 > URL: https://issues.apache.org/jira/browse/HBASE-11558 > Project: HBase > Issue Type: Bug > Components: mapreduce, Scanners >Reporter: Ishan Chhabra >Assignee: Ishan Chhabra > Fix For: 0.99.0, 0.96.3, 0.98.5, 2.0.0 > > Attachments: HBASE_11558-0.96.patch, HBASE_11558-0.96_v2.patch, > HBASE_11558-0.98.patch, HBASE_11558-0.98_v2.patch, HBASE_11558.patch, > HBASE_11558_v2.patch, HBASE_11558_v2.patch > > > 0.94 and before, if one sets caching on the Scan object in the Job by calling > scan.setCaching(int) and passes it to TableMapReduceUtil, it is correctly > read and used by the mappers during a mapreduce job. This is because > Scan.write respects and serializes caching, which is used internally by > TableMapReduceUtil to serialize and transfer the scan object to the mappers. > 0.95+, after the move to protobuf, ProtobufUtil.toScan does not respect > caching anymore as ClientProtos.Scan does not have the field caching. Caching > is passed via the ScanRequest object to the server and so is not needed in > the Scan object. However, this breaks application code that relies on the > earlier behavior. This will lead to sudden degradation in Scan performance > 0.96+ for users relying on the old behavior. > There are 2 options here: > 1. Add caching to Scan object, adding an extra int to the payload for the > Scan object which is really not needed in the general case. > 2. Document and preach that TableMapReduceUtil.setScannerCaching must be > called by the client. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11558) Caching set on Scan object gets lost when using TableMapReduceUtil in 0.95+
[ https://issues.apache.org/jira/browse/HBASE-11558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14078381#comment-14078381 ] Ishan Chhabra commented on HBASE-11558: --- Test failures are due to HBASE-11316 > Caching set on Scan object gets lost when using TableMapReduceUtil in 0.95+ > --- > > Key: HBASE-11558 > URL: https://issues.apache.org/jira/browse/HBASE-11558 > Project: HBase > Issue Type: Bug > Components: mapreduce, Scanners >Reporter: Ishan Chhabra >Assignee: Ishan Chhabra > Fix For: 0.99.0, 0.96.3, 0.98.5, 2.0.0 > > Attachments: HBASE_11558-0.96.patch, HBASE_11558-0.96_v2.patch, > HBASE_11558-0.98.patch, HBASE_11558-0.98_v2.patch, HBASE_11558.patch, > HBASE_11558_v2.patch, HBASE_11558_v2.patch > > > 0.94 and before, if one sets caching on the Scan object in the Job by calling > scan.setCaching(int) and passes it to TableMapReduceUtil, it is correctly > read and used by the mappers during a mapreduce job. This is because > Scan.write respects and serializes caching, which is used internally by > TableMapReduceUtil to serialize and transfer the scan object to the mappers. > 0.95+, after the move to protobuf, ProtobufUtil.toScan does not respect > caching anymore as ClientProtos.Scan does not have the field caching. Caching > is passed via the ScanRequest object to the server and so is not needed in > the Scan object. However, this breaks application code that relies on the > earlier behavior. This will lead to sudden degradation in Scan performance > 0.96+ for users relying on the old behavior. > There are 2 options here: > 1. Add caching to Scan object, adding an extra int to the payload for the > Scan object which is really not needed in the general case. > 2. Document and preach that TableMapReduceUtil.setScannerCaching must be > called by the client. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11558) Caching set on Scan object gets lost when using TableMapReduceUtil in 0.95+
[ https://issues.apache.org/jira/browse/HBASE-11558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14078385#comment-14078385 ] Ishan Chhabra commented on HBASE-11558: --- [~apurtell], how can I trigger the build for 0.96 and 0.98 patches? > Caching set on Scan object gets lost when using TableMapReduceUtil in 0.95+ > --- > > Key: HBASE-11558 > URL: https://issues.apache.org/jira/browse/HBASE-11558 > Project: HBase > Issue Type: Bug > Components: mapreduce, Scanners >Reporter: Ishan Chhabra >Assignee: Ishan Chhabra > Fix For: 0.99.0, 0.96.3, 0.98.5, 2.0.0 > > Attachments: HBASE_11558-0.96.patch, HBASE_11558-0.96_v2.patch, > HBASE_11558-0.98.patch, HBASE_11558-0.98_v2.patch, HBASE_11558.patch, > HBASE_11558_v2.patch, HBASE_11558_v2.patch > > > 0.94 and before, if one sets caching on the Scan object in the Job by calling > scan.setCaching(int) and passes it to TableMapReduceUtil, it is correctly > read and used by the mappers during a mapreduce job. This is because > Scan.write respects and serializes caching, which is used internally by > TableMapReduceUtil to serialize and transfer the scan object to the mappers. > 0.95+, after the move to protobuf, ProtobufUtil.toScan does not respect > caching anymore as ClientProtos.Scan does not have the field caching. Caching > is passed via the ScanRequest object to the server and so is not needed in > the Scan object. However, this breaks application code that relies on the > earlier behavior. This will lead to sudden degradation in Scan performance > 0.96+ for users relying on the old behavior. > There are 2 options here: > 1. Add caching to Scan object, adding an extra int to the payload for the > Scan object which is really not needed in the general case. > 2. Document and preach that TableMapReduceUtil.setScannerCaching must be > called by the client. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-11558) Caching set on Scan object gets lost when using TableMapReduceUtil in 0.95+
[ https://issues.apache.org/jira/browse/HBASE-11558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ishan Chhabra updated HBASE-11558: -- Status: Patch Available (was: Open) > Caching set on Scan object gets lost when using TableMapReduceUtil in 0.95+ > --- > > Key: HBASE-11558 > URL: https://issues.apache.org/jira/browse/HBASE-11558 > Project: HBase > Issue Type: Bug > Components: mapreduce, Scanners >Reporter: Ishan Chhabra >Assignee: Ishan Chhabra > Fix For: 0.99.0, 0.96.3, 0.98.5, 2.0.0 > > Attachments: HBASE_11558-0.96.patch, HBASE_11558-0.96_v2.patch, > HBASE_11558-0.98.patch, HBASE_11558-0.98_v2.patch, HBASE_11558.patch, > HBASE_11558_v2.patch > > > 0.94 and before, if one sets caching on the Scan object in the Job by calling > scan.setCaching(int) and passes it to TableMapReduceUtil, it is correctly > read and used by the mappers during a mapreduce job. This is because > Scan.write respects and serializes caching, which is used internally by > TableMapReduceUtil to serialize and transfer the scan object to the mappers. > 0.95+, after the move to protobuf, ProtobufUtil.toScan does not respect > caching anymore as ClientProtos.Scan does not have the field caching. Caching > is passed via the ScanRequest object to the server and so is not needed in > the Scan object. However, this breaks application code that relies on the > earlier behavior. This will lead to sudden degradation in Scan performance > 0.96+ for users relying on the old behavior. > There are 2 options here: > 1. Add caching to Scan object, adding an extra int to the payload for the > Scan object which is really not needed in the general case. > 2. Document and preach that TableMapReduceUtil.setScannerCaching must be > called by the client. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-11558) Caching set on Scan object gets lost when using TableMapReduceUtil in 0.95+
[ https://issues.apache.org/jira/browse/HBASE-11558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ishan Chhabra updated HBASE-11558: -- Attachment: HBASE_11558-0.98_v2.patch HBASE_11558-0.96_v2.patch HBASE_11558_v2.patch Fixed ProtobufUtil test and enhanced it a bit. PTAL. > Caching set on Scan object gets lost when using TableMapReduceUtil in 0.95+ > --- > > Key: HBASE-11558 > URL: https://issues.apache.org/jira/browse/HBASE-11558 > Project: HBase > Issue Type: Bug > Components: mapreduce, Scanners >Reporter: Ishan Chhabra >Assignee: Ishan Chhabra > Fix For: 0.99.0, 0.96.3, 0.98.5, 2.0.0 > > Attachments: HBASE_11558-0.96.patch, HBASE_11558-0.96_v2.patch, > HBASE_11558-0.98.patch, HBASE_11558-0.98_v2.patch, HBASE_11558.patch, > HBASE_11558_v2.patch > > > 0.94 and before, if one sets caching on the Scan object in the Job by calling > scan.setCaching(int) and passes it to TableMapReduceUtil, it is correctly > read and used by the mappers during a mapreduce job. This is because > Scan.write respects and serializes caching, which is used internally by > TableMapReduceUtil to serialize and transfer the scan object to the mappers. > 0.95+, after the move to protobuf, ProtobufUtil.toScan does not respect > caching anymore as ClientProtos.Scan does not have the field caching. Caching > is passed via the ScanRequest object to the server and so is not needed in > the Scan object. However, this breaks application code that relies on the > earlier behavior. This will lead to sudden degradation in Scan performance > 0.96+ for users relying on the old behavior. > There are 2 options here: > 1. Add caching to Scan object, adding an extra int to the payload for the > Scan object which is really not needed in the general case. > 2. Document and preach that TableMapReduceUtil.setScannerCaching must be > called by the client. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-11558) Caching set on Scan object gets lost when using TableMapReduceUtil in 0.95+
[ https://issues.apache.org/jira/browse/HBASE-11558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ishan Chhabra updated HBASE-11558: -- Status: Open (was: Patch Available) > Caching set on Scan object gets lost when using TableMapReduceUtil in 0.95+ > --- > > Key: HBASE-11558 > URL: https://issues.apache.org/jira/browse/HBASE-11558 > Project: HBase > Issue Type: Bug > Components: mapreduce, Scanners >Reporter: Ishan Chhabra >Assignee: Ishan Chhabra > Fix For: 0.99.0, 0.96.3, 0.98.5, 2.0.0 > > Attachments: HBASE_11558-0.96.patch, HBASE_11558-0.96_v2.patch, > HBASE_11558-0.98.patch, HBASE_11558-0.98_v2.patch, HBASE_11558.patch, > HBASE_11558_v2.patch > > > 0.94 and before, if one sets caching on the Scan object in the Job by calling > scan.setCaching(int) and passes it to TableMapReduceUtil, it is correctly > read and used by the mappers during a mapreduce job. This is because > Scan.write respects and serializes caching, which is used internally by > TableMapReduceUtil to serialize and transfer the scan object to the mappers. > 0.95+, after the move to protobuf, ProtobufUtil.toScan does not respect > caching anymore as ClientProtos.Scan does not have the field caching. Caching > is passed via the ScanRequest object to the server and so is not needed in > the Scan object. However, this breaks application code that relies on the > earlier behavior. This will lead to sudden degradation in Scan performance > 0.96+ for users relying on the old behavior. > There are 2 options here: > 1. Add caching to Scan object, adding an extra int to the payload for the > Scan object which is really not needed in the general case. > 2. Document and preach that TableMapReduceUtil.setScannerCaching must be > called by the client. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-11558) Caching set on Scan object gets lost when using TableMapReduceUtil in 0.95+
[ https://issues.apache.org/jira/browse/HBASE-11558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ishan Chhabra updated HBASE-11558: -- Fix Version/s: 0.96.3 Status: Patch Available (was: Open) Attached patch and added 0.96 as a fix version. > Caching set on Scan object gets lost when using TableMapReduceUtil in 0.95+ > --- > > Key: HBASE-11558 > URL: https://issues.apache.org/jira/browse/HBASE-11558 > Project: HBase > Issue Type: Bug > Components: mapreduce, Scanners >Reporter: Ishan Chhabra >Assignee: Ishan Chhabra > Fix For: 0.99.0, 0.96.3, 0.98.5, 2.0.0 > > Attachments: HBASE_11558-0.96.patch, HBASE_11558-0.98.patch, > HBASE_11558.patch > > > 0.94 and before, if one sets caching on the Scan object in the Job by calling > scan.setCaching(int) and passes it to TableMapReduceUtil, it is correctly > read and used by the mappers during a mapreduce job. This is because > Scan.write respects and serializes caching, which is used internally by > TableMapReduceUtil to serialize and transfer the scan object to the mappers. > 0.95+, after the move to protobuf, ProtobufUtil.toScan does not respect > caching anymore as ClientProtos.Scan does not have the field caching. Caching > is passed via the ScanRequest object to the server and so is not needed in > the Scan object. However, this breaks application code that relies on the > earlier behavior. This will lead to sudden degradation in Scan performance > 0.96+ for users relying on the old behavior. > There are 2 options here: > 1. Add caching to Scan object, adding an extra int to the payload for the > Scan object which is really not needed in the general case. > 2. Document and preach that TableMapReduceUtil.setScannerCaching must be > called by the client. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-11558) Caching set on Scan object gets lost when using TableMapReduceUtil in 0.95+
[ https://issues.apache.org/jira/browse/HBASE-11558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ishan Chhabra updated HBASE-11558: -- Attachment: HBASE_11558.patch HBASE_11558-0.98.patch HBASE_11558-0.96.patch Patch for trunk, 96 and 98. > Caching set on Scan object gets lost when using TableMapReduceUtil in 0.95+ > --- > > Key: HBASE-11558 > URL: https://issues.apache.org/jira/browse/HBASE-11558 > Project: HBase > Issue Type: Bug > Components: mapreduce, Scanners >Reporter: Ishan Chhabra >Assignee: Ishan Chhabra > Fix For: 0.99.0, 0.98.5, 2.0.0 > > Attachments: HBASE_11558-0.96.patch, HBASE_11558-0.98.patch, > HBASE_11558.patch > > > 0.94 and before, if one sets caching on the Scan object in the Job by calling > scan.setCaching(int) and passes it to TableMapReduceUtil, it is correctly > read and used by the mappers during a mapreduce job. This is because > Scan.write respects and serializes caching, which is used internally by > TableMapReduceUtil to serialize and transfer the scan object to the mappers. > 0.95+, after the move to protobuf, ProtobufUtil.toScan does not respect > caching anymore as ClientProtos.Scan does not have the field caching. Caching > is passed via the ScanRequest object to the server and so is not needed in > the Scan object. However, this breaks application code that relies on the > earlier behavior. This will lead to sudden degradation in Scan performance > 0.96+ for users relying on the old behavior. > There are 2 options here: > 1. Add caching to Scan object, adding an extra int to the payload for the > Scan object which is really not needed in the general case. > 2. Document and preach that TableMapReduceUtil.setScannerCaching must be > called by the client. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11558) Caching set on Scan object gets lost when using TableMapReduceUtil in 0.95+
[ https://issues.apache.org/jira/browse/HBASE-11558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14073049#comment-14073049 ] Ishan Chhabra commented on HBASE-11558: --- [~apurtell], If caching is set during a general scan (not MapReduce), it will be serialized and sent in the openScanner request even though it is not needed. However, it would just be 3-4 bytes more overhead, and only in the openScanner call and not the next call. If this is ok, I would be happy to put a patch up. > Caching set on Scan object gets lost when using TableMapReduceUtil in 0.95+ > --- > > Key: HBASE-11558 > URL: https://issues.apache.org/jira/browse/HBASE-11558 > Project: HBase > Issue Type: Bug > Components: mapreduce, Scanners >Reporter: Ishan Chhabra >Assignee: Andrew Purtell > Fix For: 0.99.0, 0.98.5, 2.0.0 > > > 0.94 and before, if one sets caching on the Scan object in the Job by calling > scan.setCaching(int) and passes it to TableMapReduceUtil, it is correctly > read and used by the mappers during a mapreduce job. This is because > Scan.write respects and serializes caching, which is used internally by > TableMapReduceUtil to serialize and transfer the scan object to the mappers. > 0.95+, after the move to protobuf, ProtobufUtil.toScan does not respect > caching anymore as ClientProtos.Scan does not have the field caching. Caching > is passed via the ScanRequest object to the server and so is not needed in > the Scan object. However, this breaks application code that relies on the > earlier behavior. This will lead to sudden degradation in Scan performance > 0.96+ for users relying on the old behavior. > There are 2 options here: > 1. Add caching to Scan object, adding an extra int to the payload for the > Scan object which is really not needed in the general case. > 2. Document and preach that TableMapReduceUtil.setScannerCaching must be > called by the client. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-11558) Caching set on Scan object gets lost when using TableMapReduceUtil in 0.95+
[ https://issues.apache.org/jira/browse/HBASE-11558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14069719#comment-14069719 ] Ishan Chhabra commented on HBASE-11558: --- Unfortunately our configuration has this value set to 1 (carried over from the default in 0.94), and we ran into this issue. Another person on the mailing list was also perplexed by this (not sure if he went from 5000 to 100 or 5000 to 1). > Caching set on Scan object gets lost when using TableMapReduceUtil in 0.95+ > --- > > Key: HBASE-11558 > URL: https://issues.apache.org/jira/browse/HBASE-11558 > Project: HBase > Issue Type: Bug > Components: mapreduce, Scanners >Affects Versions: 0.98.0, 0.95.0, 0.96.0 >Reporter: Ishan Chhabra > > 0.94 and before, if one sets caching on the Scan object in the Job by calling > scan.setCaching(int) and passes it to TableMapReduceUtil, it is correctly > read and used by the mappers during a mapreduce job. This is because > Scan.write respects and serializes caching, which is used internally by > TableMapReduceUtil to serialize and transfer the scan object to the mappers. > 0.95+, after the move to protobuf, ProtobufUtil.toScan does not respect > caching anymore as ClientProtos.Scan does not have the field caching. Caching > is passed via the ScanRequest object to the server and so is not needed in > the Scan object. However, this breaks application code that relies on the > earlier behavior. This will lead to sudden degradation in Scan performance > 0.96+ for users relying on the old behavior. > There are 2 options here: > 1. Add caching to Scan object, adding an extra int to the payload for the > Scan object which is really not needed in the general case. > 2. Document and preach that TableMapReduceUtil.setScannerCaching must be > called by the client. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HBASE-11558) Caching set on Scan object gets lost when using TableMapReduceUtil in 0.95+
[ https://issues.apache.org/jira/browse/HBASE-11558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ishan Chhabra updated HBASE-11558: -- Affects Version/s: 0.98.0 0.96.0 > Caching set on Scan object gets lost when using TableMapReduceUtil in 0.95+ > --- > > Key: HBASE-11558 > URL: https://issues.apache.org/jira/browse/HBASE-11558 > Project: HBase > Issue Type: Bug > Components: Scanners >Affects Versions: 0.98.0, 0.95.0, 0.96.0 >Reporter: Ishan Chhabra > > 0.94 and before, if one sets caching on the Scan object in the Job by calling > scan.setCaching(int) and passes it to TableMapReduceUtil, it is correctly > read and used by the mappers during a mapreduce job. This is because > Scan.write respects and serializes caching, which is used internally by > TableMapReduceUtil to serialize and transfer the scan object to the mappers. > 0.95+, after the move to protobuf, ProtobufUtil.toScan does not respect > caching anymore as ClientProtos.Scan does not have the field caching. Caching > is passed via the ScanRequest object to the server and so is not needed in > the Scan object. However, this breaks application code that relies on the > earlier behavior. This will lead to sudden degradation in Scan performance > 0.96+ for users relying on the old behavior. > There are 2 options here: > 1. Add caching to Scan object, adding an extra int to the payload for the > Scan object which is really not needed in the general case. > 2. Document and preach that TableMapReduceUtil.setScannerCaching must be > called by the client. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HBASE-11558) Caching set on Scan object gets lost when using TableMapReduceUtil in 0.95+
Ishan Chhabra created HBASE-11558: - Summary: Caching set on Scan object gets lost when using TableMapReduceUtil in 0.95+ Key: HBASE-11558 URL: https://issues.apache.org/jira/browse/HBASE-11558 Project: HBase Issue Type: Bug Components: Scanners Affects Versions: 0.95.0 Reporter: Ishan Chhabra 0.94 and before, if one sets caching on the Scan object in the Job by calling scan.setCaching(int) and passes it to TableMapReduceUtil, it is correctly read and used by the mappers during a mapreduce job. This is because Scan.write respects and serializes caching, which is used internally by TableMapReduceUtil to serialize and transfer the scan object to the mappers. 0.95+, after the move to protobuf, ProtobufUtil.toScan does not respect caching anymore as ClientProtos.Scan does not have the field caching. Caching is passed via the ScanRequest object to the server and so is not needed in the Scan object. However, this breaks application code that relies on the earlier behavior. This will lead to sudden degradation in Scan performance 0.96+ for users relying on the old behavior. There are 2 options here: 1. Add caching to Scan object, adding an extra int to the payload for the Scan object which is really not needed in the general case. 2. Document and preach that TableMapReduceUtil.setScannerCaching must be called by the client. -- This message was sent by Atlassian JIRA (v6.2#6252)
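To illustrate the mechanism described above, a small sketch of the protobuf round trip that loses the caching value in 0.95+ (as things stand before any fix from this JIRA):
{code}
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.protobuf.ProtobufUtil;
import org.apache.hadoop.hbase.protobuf.generated.ClientProtos;

public class ScanCachingRoundTrip {
  public static void main(String[] args) throws Exception {
    Scan scan = new Scan();
    scan.setCaching(1000);

    // TableMapReduceUtil ships the Scan to the mappers by serializing it;
    // with protobuf that means going through ProtobufUtil / ClientProtos.Scan.
    ClientProtos.Scan proto = ProtobufUtil.toScan(scan);
    Scan roundTripped = ProtobufUtil.toScan(proto);

    // Since ClientProtos.Scan has no caching field (caching travels in the
    // ScanRequest instead), the value set above does not survive the round
    // trip, and the mappers fall back to the default.
    System.out.println(roundTripped.getCaching());
  }
}
{code}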
[jira] [Commented] (HBASE-11552) Read/Write requests count metric value is too short
[ https://issues.apache.org/jira/browse/HBASE-11552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14068959#comment-14068959 ] Ishan Chhabra commented on HBASE-11552: --- This duplicates https://issues.apache.org/jira/browse/HBASE-10645 > Read/Write requests count metric value is too short > --- > > Key: HBASE-11552 > URL: https://issues.apache.org/jira/browse/HBASE-11552 > Project: HBase > Issue Type: Bug > Components: metrics >Affects Versions: 0.94.21 >Reporter: Adrian Muraru > Fix For: 0.94.22 > > Attachments: HBASE-11552_0.94_v1.diff > > > I am using {{readRequestsCount}} and {{writeRequestsCount}} counters to plot > HBase activity in opentsdb and noticed that they are exported as an int value > although the underlying counter is backed by a {{long}}. > The metric should be a {{long}} as well. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-10646) Enable security features by default for 1.0
[ https://issues.apache.org/jira/browse/HBASE-10646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14017200#comment-14017200 ] Ishan Chhabra commented on HBASE-10646: --- Will main RPCs like Get, Put, etc. (apart from the admin RPCs) also be secured after that change? Any extra overhead in these RPCs would be unacceptable in our use case. Also, +1 for a simple security = false option. I believe many users don't need security, and any extra overhead (in terms of deployment complexity or runtime cost) would be undesirable. > Enable security features by default for 1.0 > --- > > Key: HBASE-10646 > URL: https://issues.apache.org/jira/browse/HBASE-10646 > Project: HBase > Issue Type: Task >Affects Versions: 0.99.0 >Reporter: Andrew Purtell > > As discussed in the last PMC meeting, we should enable security features by > default in 1.0. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-10921) Port HBASE-10323 'Auto detect data block encoding in HFileOutputFormat' to 0.94 / 0.96
[ https://issues.apache.org/jira/browse/HBASE-10921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13963297#comment-13963297 ] Ishan Chhabra commented on HBASE-10921: --- HBASE-10323 had a patch attached for 0.94. Is the one attached here the same, just rebased on top of 0.94 head? > Port HBASE-10323 'Auto detect data block encoding in HFileOutputFormat' to > 0.94 / 0.96 > -- > > Key: HBASE-10921 > URL: https://issues.apache.org/jira/browse/HBASE-10921 > Project: HBase > Issue Type: Task >Affects Versions: 0.96.2, 0.94.18 >Reporter: Ted Yu >Assignee: Kashif J S > Fix For: 0.94.19, 0.96.3 > > Attachments: HBASE-10921-0.94-v1.patch, HBASE-10921-0.96-v1.patch > > > This issue is to backport auto detection of data block encoding in > HFileOutputFormat to 0.94 and 0.96 branches. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HBASE-10380) Add bytesBinary and filter options to CopyTable
[ https://issues.apache.org/jira/browse/HBASE-10380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13878402#comment-13878402 ] Ishan Chhabra commented on HBASE-10380: --- Sure. I didn't know about ParseFilter. I tried to build my own textual language initially, but it became complicated quickly. > Add bytesBinary and filter options to CopyTable > --- > > Key: HBASE-10380 > URL: https://issues.apache.org/jira/browse/HBASE-10380 > Project: HBase > Issue Type: Improvement > Components: mapreduce >Reporter: Ishan Chhabra >Assignee: Ishan Chhabra >Priority: Minor > Attachments: HBASE_10380_0.94-v1.patch > > > Add options in CopyTable to: > 1. Specify the start and stop row in "bytesBinary" format > 2. Use filters -- This message was sent by Atlassian JIRA (v6.1.5#6160)
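For reference, a minimal sketch of the ParseFilter route mentioned above; the filter expression is just an example and not tied to any particular CopyTable patch:
{code}
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.Filter;
import org.apache.hadoop.hbase.filter.ParseFilter;

public class ParseFilterExample {
  public static void main(String[] args) throws Exception {
    // ParseFilter turns the shell/Thrift-style textual filter language into a
    // Filter object, including FilterLists built with AND / OR.
    Filter filter = new ParseFilter().parseFilterString(
        "PrefixFilter ('row-') AND KeyOnlyFilter ()");

    Scan scan = new Scan();
    scan.setFilter(filter);
    // ... pass the Scan to CopyTable / TableMapReduceUtil as usual ...
  }
}
{code}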
[jira] [Commented] (HBASE-10380) Add bytesBinary and filter options to CopyTable
[ https://issues.apache.org/jira/browse/HBASE-10380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13876880#comment-13876880 ] Ishan Chhabra commented on HBASE-10380: --- Ok. I'll submit a trunk patch then. I tend to create 0.94 patches first since we are running that internally. > Add bytesBinary and filter options to CopyTable > --- > > Key: HBASE-10380 > URL: https://issues.apache.org/jira/browse/HBASE-10380 > Project: HBase > Issue Type: Improvement > Components: mapreduce >Reporter: Ishan Chhabra >Assignee: Ishan Chhabra >Priority: Minor > Attachments: HBASE_10380_0.94-v1.patch > > > Add options in CopyTable to: > 1. Specify the start and stop row in "bytesBinary" format > 2. Use filters -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10380) Add bytesBinary and filter options to CopyTable
[ https://issues.apache.org/jira/browse/HBASE-10380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13876259#comment-13876259 ] Ishan Chhabra commented on HBASE-10380: --- If the approach looks good, then I can build a patch for trunk. > Add bytesBinary and filter options to CopyTable > --- > > Key: HBASE-10380 > URL: https://issues.apache.org/jira/browse/HBASE-10380 > Project: HBase > Issue Type: Improvement > Components: mapreduce >Reporter: Ishan Chhabra >Assignee: Ishan Chhabra >Priority: Minor > Attachments: HBASE_10380_0.94-v1.patch > > > Add options in CopyTable to: > 1. Specify the start and stop row in "bytesBinary" format > 2. Use filters -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10380) Add bytesBinary and filter options to CopyTable
[ https://issues.apache.org/jira/browse/HBASE-10380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13876257#comment-13876257 ] Ishan Chhabra commented on HBASE-10380: --- For filters, the patch allows one to specify a file containing the filter in a serialized form. This seemed to be the only generic way to specify filters and allows complex filters (including filter lists). > Add bytesBinary and filter options to CopyTable > --- > > Key: HBASE-10380 > URL: https://issues.apache.org/jira/browse/HBASE-10380 > Project: HBase > Issue Type: Improvement > Components: mapreduce >Reporter: Ishan Chhabra >Assignee: Ishan Chhabra >Priority: Minor > Attachments: HBASE_10380_0.94-v1.patch > > > Add options in CopyTable to: > 1. Specify the start and stop row in "bytesBinary" format > 2. Use filters -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HBASE-10380) Add bytesBinary and filter options to CopyTable
[ https://issues.apache.org/jira/browse/HBASE-10380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ishan Chhabra updated HBASE-10380: -- Description: Add options in CopyTable to: 1. Specify the start and stop row in "bytesBinary" format 2. Use filters was: Add options in CopyTable to: 1. specify the start and stop row in "bytesBinary" format 2. Use filters > Add bytesBinary and filter options to CopyTable > --- > > Key: HBASE-10380 > URL: https://issues.apache.org/jira/browse/HBASE-10380 > Project: HBase > Issue Type: Improvement > Components: mapreduce >Reporter: Ishan Chhabra >Assignee: Ishan Chhabra >Priority: Minor > Attachments: HBASE_10380_0.94-v1.patch > > > Add options in CopyTable to: > 1. Specify the start and stop row in "bytesBinary" format > 2. Use filters -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HBASE-10380) Add bytesBinary and filter options to CopyTable
[ https://issues.apache.org/jira/browse/HBASE-10380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ishan Chhabra updated HBASE-10380: -- Attachment: HBASE_10380_0.94-v1.patch > Add bytesBinary and filter options to CopyTable > --- > > Key: HBASE-10380 > URL: https://issues.apache.org/jira/browse/HBASE-10380 > Project: HBase > Issue Type: Improvement > Components: mapreduce >Reporter: Ishan Chhabra >Assignee: Ishan Chhabra >Priority: Minor > Attachments: HBASE_10380_0.94-v1.patch > > > Add options in CopyTable to: > 1. specify the start and stop row in "bytesBinary" format > 2. Use filters -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HBASE-10380) Add bytesBinary and filter options to CopyTable
[ https://issues.apache.org/jira/browse/HBASE-10380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ishan Chhabra updated HBASE-10380: -- Description: Add options in CopyTable to: 1. specify the start and stop row in "bytesBinary" format 2. Use filters > Add bytesBinary and filter options to CopyTable > --- > > Key: HBASE-10380 > URL: https://issues.apache.org/jira/browse/HBASE-10380 > Project: HBase > Issue Type: Improvement > Components: mapreduce >Reporter: Ishan Chhabra >Assignee: Ishan Chhabra >Priority: Minor > > Add options in CopyTable to: > 1. specify the start and stop row in "bytesBinary" format > 2. Use filters -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HBASE-10380) Add bytesBinary and filter options to CopyTable
[ https://issues.apache.org/jira/browse/HBASE-10380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ishan Chhabra updated HBASE-10380: -- Status: Patch Available (was: Open) > Add bytesBinary and filter options to CopyTable > --- > > Key: HBASE-10380 > URL: https://issues.apache.org/jira/browse/HBASE-10380 > Project: HBase > Issue Type: Improvement > Components: mapreduce >Reporter: Ishan Chhabra >Assignee: Ishan Chhabra >Priority: Minor > > Add options in CopyTable to: > 1. specify the start and stop row in "bytesBinary" format > 2. Use filters -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (HBASE-10380) Add bytesBinary and filter options to CopyTable
Ishan Chhabra created HBASE-10380: - Summary: Add bytesBinary and filter options to CopyTable Key: HBASE-10380 URL: https://issues.apache.org/jira/browse/HBASE-10380 Project: HBase Issue Type: Improvement Components: mapreduce Reporter: Ishan Chhabra Assignee: Ishan Chhabra Priority: Minor -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat
[ https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13876054#comment-13876054 ] Ishan Chhabra commented on HBASE-10323: --- Added the @VisibleForTesting annotations where needed and fixed the '{' placed on a new line. I didn't make the constants package-private since no other class needs them at the moment. When some other class in the package or a test needs them, they can be made package-private then. > Auto detect data block encoding in HFileOutputFormat > > > Key: HBASE-10323 > URL: https://issues.apache.org/jira/browse/HBASE-10323 > Project: HBase > Issue Type: Improvement > Components: mapreduce >Reporter: Ishan Chhabra >Assignee: Ishan Chhabra > Fix For: 0.99.0 > > Attachments: HBASE_10323-0.94.15-v1.patch, > HBASE_10323-0.94.15-v2.patch, HBASE_10323-0.94.15-v3.patch, > HBASE_10323-0.94.15-v4.patch, HBASE_10323-0.94.15-v5.patch, > HBASE_10323-trunk-v1.patch, HBASE_10323-trunk-v2.patch, > HBASE_10323-trunk-v3.patch, HBASE_10323-trunk-v4.patch > > > Currently, one has to specify the data block encoding of the table explicitly > using the config parameter > "hbase.mapreduce.hfileoutputformat.datablock.encoding" when doing a bulk > load. This option is easily missed, not documented and also works differently > than compression, block size and bloom filter type, which are auto detected. > The solution would be to add support to auto detect datablock encoding > similar to other parameters. > The current patch does the following: > 1. Automatically detects datablock encoding in HFileOutputFormat. > 2. Keeps the legacy option of manually specifying the datablock encoding > around as a method to override auto detections. > 3. Moves string conf parsing to the start of the program so that it fails > fast during starting up instead of failing during record writes. It also > makes the internals of the program type safe. > 4. Adds missing doc strings and unit tests for code serializing and > deserializing config parameters for bloom filter type, block size and > datablock encoding. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
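At the usage level, the change boils down to something like the following sketch (table and job names are placeholders; this is an illustration of the behavior described above, not the patch itself):
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.mapreduce.HFileOutputFormat;
import org.apache.hadoop.mapreduce.Job;

public class BulkLoadSetup {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    Job job = new Job(conf, "bulk load prepare");   // placeholder job name
    HTable table = new HTable(conf, "my_table");    // placeholder table name

    // With this change, data block encoding is auto detected from the table's
    // column families here, just like compression, block size and bloom filter
    // type already are.
    HFileOutputFormat.configureIncrementalLoad(job, table);

    // The legacy manual override is kept and, when set, wins over auto detection.
    job.getConfiguration().set(
        "hbase.mapreduce.hfileoutputformat.datablock.encoding", "FAST_DIFF");
  }
}
{code}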
[jira] [Updated] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat
[ https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ishan Chhabra updated HBASE-10323: -- Attachment: HBASE_10323-trunk-v4.patch HBASE_10323-0.94.15-v5.patch > Auto detect data block encoding in HFileOutputFormat > > > Key: HBASE-10323 > URL: https://issues.apache.org/jira/browse/HBASE-10323 > Project: HBase > Issue Type: Improvement > Components: mapreduce >Reporter: Ishan Chhabra >Assignee: Ishan Chhabra > Fix For: 0.99.0 > > Attachments: HBASE_10323-0.94.15-v1.patch, > HBASE_10323-0.94.15-v2.patch, HBASE_10323-0.94.15-v3.patch, > HBASE_10323-0.94.15-v4.patch, HBASE_10323-0.94.15-v5.patch, > HBASE_10323-trunk-v1.patch, HBASE_10323-trunk-v2.patch, > HBASE_10323-trunk-v3.patch, HBASE_10323-trunk-v4.patch > > > Currently, one has to specify the data block encoding of the table explicitly > using the config parameter > "hbase.mapreduce.hfileoutputformat.datablock.encoding" when doing a bulk > load. This option is easily missed, not documented and also works differently > than compression, block size and bloom filter type, which are auto detected. > The solution would be to add support to auto detect datablock encoding > similar to other parameters. > The current patch does the following: > 1. Automatically detects datablock encoding in HFileOutputFormat. > 2. Keeps the legacy option of manually specifying the datablock encoding > around as a method to override auto detections. > 3. Moves string conf parsing to the start of the program so that it fails > fast during starting up instead of failing during record writes. It also > makes the internals of the program type safe. > 4. Adds missing doc strings and unit tests for code serializing and > deserializing config parameters for bloom filter type, block size and > datablock encoding. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat
[ https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13873071#comment-13873071 ] Ishan Chhabra commented on HBASE-10323: --- Can someone else look and +1? [~lhofhansl]? > Auto detect data block encoding in HFileOutputFormat > > > Key: HBASE-10323 > URL: https://issues.apache.org/jira/browse/HBASE-10323 > Project: HBase > Issue Type: Improvement > Components: mapreduce >Reporter: Ishan Chhabra >Assignee: Ishan Chhabra > Fix For: 0.99.0 > > Attachments: HBASE_10323-0.94.15-v1.patch, > HBASE_10323-0.94.15-v2.patch, HBASE_10323-0.94.15-v3.patch, > HBASE_10323-0.94.15-v4.patch, HBASE_10323-trunk-v1.patch, > HBASE_10323-trunk-v2.patch, HBASE_10323-trunk-v3.patch > > > Currently, one has to specify the data block encoding of the table explicitly > using the config parameter > "hbase.mapreduce.hfileoutputformat.datablock.encoding" when doing a bulk > load. This option is easily missed, not documented and also works differently > than compression, block size and bloom filter type, which are auto detected. > The solution would be to add support to auto detect datablock encoding > similar to other parameters. > The current patch does the following: > 1. Automatically detects datablock encoding in HFileOutputFormat. > 2. Keeps the legacy option of manually specifying the datablock encoding > around as a method to override auto detections. > 3. Moves string conf parsing to the start of the program so that it fails > fast during starting up instead of failing during record writes. It also > makes the internals of the program type safe. > 4. Adds missing doc strings and unit tests for code serializing and > deserializing config parameters for bloom filter type, block size and > datablock encoding. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat
[ https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ishan Chhabra updated HBASE-10323: -- Attachment: HBASE_10323-trunk-v3.patch HBASE_10323-0.94.15-v4.patch Changed trunk patch to work directly with DataBlockEncoding instead of HFileDataBlockEncoder. > Auto detect data block encoding in HFileOutputFormat > > > Key: HBASE-10323 > URL: https://issues.apache.org/jira/browse/HBASE-10323 > Project: HBase > Issue Type: Improvement >Reporter: Ishan Chhabra >Assignee: Ishan Chhabra > Fix For: 0.99.0 > > Attachments: HBASE_10323-0.94.15-v1.patch, > HBASE_10323-0.94.15-v2.patch, HBASE_10323-0.94.15-v3.patch, > HBASE_10323-0.94.15-v4.patch, HBASE_10323-trunk-v1.patch, > HBASE_10323-trunk-v2.patch, HBASE_10323-trunk-v3.patch > > > Currently, one has to specify the data block encoding of the table explicitly > using the config parameter > "hbase.mapreduce.hfileoutputformat.datablock.encoding" when doing a bulk > load. This option is easily missed, not documented and also works differently > than compression, block size and bloom filter type, which are auto detected. > The solution would be to add support to auto detect datablock encoding > similar to other parameters. > The current patch does the following: > 1. Automatically detects datablock encoding in HFileOutputFormat. > 2. Keeps the legacy option of manually specifying the datablock encoding > around as a method to override auto detections. > 3. Moves string conf parsing to the start of the program so that it fails > fast during starting up instead of failing during record writes. It also > makes the internals of the program type safe. > 4. Adds missing doc strings and unit tests for code serializing and > deserializing config parameters for bloom filter type, block size and > datablock encoding. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat
[ https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13869325#comment-13869325 ] Ishan Chhabra commented on HBASE-10323: --- I was able to run the maven site successfully on my box. Can't figure out why it is failing based on the console output. Can somebody help? > Auto detect data block encoding in HFileOutputFormat > > > Key: HBASE-10323 > URL: https://issues.apache.org/jira/browse/HBASE-10323 > Project: HBase > Issue Type: Improvement >Reporter: Ishan Chhabra >Assignee: Ishan Chhabra > Attachments: HBASE_10323-0.94.15-v1.patch, > HBASE_10323-0.94.15-v2.patch, HBASE_10323-0.94.15-v3.patch, > HBASE_10323-trunk-v1.patch, HBASE_10323-trunk-v2.patch > > > Currently, one has to specify the data block encoding of the table explicitly > using the config parameter > "hbase.mapreduce.hfileoutputformat.datablock.encoding" when doing a bulkload > load. This option is easily missed, not documented and also works differently > than compression, block size and bloom filter type, which are auto detected. > The solution would be to add support to auto detect datablock encoding > similar to other parameters. > The current patch does the following: > 1. Automatically detects datablock encoding in HFileOutputFormat. > 2. Keeps the legacy option of manually specifying the datablock encoding > around as a method to override auto detections. > 3. Moves string conf parsing to the start of the program so that it fails > fast during starting up instead of failing during record writes. It also > makes the internals of the program type safe. > 4. Adds missing doc strings and unit tests for code serializing and > deserializing config paramerters for bloom filer type, block size and > datablock encoding. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat
[ https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ishan Chhabra updated HBASE-10323: -- Attachment: HBASE_10323-trunk-v2.patch > Auto detect data block encoding in HFileOutputFormat > > > Key: HBASE-10323 > URL: https://issues.apache.org/jira/browse/HBASE-10323 > Project: HBase > Issue Type: Improvement >Reporter: Ishan Chhabra >Assignee: Ishan Chhabra > Attachments: HBASE_10323-0.94.15-v1.patch, > HBASE_10323-0.94.15-v2.patch, HBASE_10323-0.94.15-v3.patch, > HBASE_10323-trunk-v1.patch, HBASE_10323-trunk-v2.patch > > > Currently, one has to specify the data block encoding of the table explicitly > using the config parameter > "hbase.mapreduce.hfileoutputformat.datablock.encoding" when doing a bulkload > load. This option is easily missed, not documented and also works differently > than compression, block size and bloom filter type, which are auto detected. > The solution would be to add support to auto detect datablock encoding > similar to other parameters. > The current patch does the following: > 1. Automatically detects datablock encoding in HFileOutputFormat. > 2. Keeps the legacy option of manually specifying the datablock encoding > around as a method to override auto detections. > 3. Moves string conf parsing to the start of the program so that it fails > fast during starting up instead of failing during record writes. It also > makes the internals of the program type safe. > 4. Adds missing doc strings and unit tests for code serializing and > deserializing config paramerters for bloom filer type, block size and > datablock encoding. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat
[ https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ishan Chhabra updated HBASE-10323: -- Attachment: (was: HBASE_10323-trunk-v2.patch) > Auto detect data block encoding in HFileOutputFormat > > > Key: HBASE-10323 > URL: https://issues.apache.org/jira/browse/HBASE-10323 > Project: HBase > Issue Type: Improvement >Reporter: Ishan Chhabra >Assignee: Ishan Chhabra > Attachments: HBASE_10323-0.94.15-v1.patch, > HBASE_10323-0.94.15-v2.patch, HBASE_10323-0.94.15-v3.patch, > HBASE_10323-trunk-v1.patch > > > Currently, one has to specify the data block encoding of the table explicitly > using the config parameter > "hbase.mapreduce.hfileoutputformat.datablock.encoding" when doing a bulkload > load. This option is easily missed, not documented and also works differently > than compression, block size and bloom filter type, which are auto detected. > The solution would be to add support to auto detect datablock encoding > similar to other parameters. > The current patch does the following: > 1. Automatically detects datablock encoding in HFileOutputFormat. > 2. Keeps the legacy option of manually specifying the datablock encoding > around as a method to override auto detections. > 3. Moves string conf parsing to the start of the program so that it fails > fast during starting up instead of failing during record writes. It also > makes the internals of the program type safe. > 4. Adds missing doc strings and unit tests for code serializing and > deserializing config paramerters for bloom filer type, block size and > datablock encoding. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat
[ https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ishan Chhabra updated HBASE-10323: -- Attachment: HBASE_10323-trunk-v2.patch HBASE_10323-0.94.15-v3.patch > Auto detect data block encoding in HFileOutputFormat > > > Key: HBASE-10323 > URL: https://issues.apache.org/jira/browse/HBASE-10323 > Project: HBase > Issue Type: Improvement >Reporter: Ishan Chhabra >Assignee: Ishan Chhabra > Attachments: HBASE_10323-0.94.15-v1.patch, > HBASE_10323-0.94.15-v2.patch, HBASE_10323-0.94.15-v3.patch, > HBASE_10323-trunk-v1.patch, HBASE_10323-trunk-v2.patch > > > Currently, one has to specify the data block encoding of the table explicitly > using the config parameter > "hbase.mapreduce.hfileoutputformat.datablock.encoding" when doing a bulkload > load. This option is easily missed, not documented and also works differently > than compression, block size and bloom filter type, which are auto detected. > The solution would be to add support to auto detect datablock encoding > similar to other parameters. > The current patch does the following: > 1. Automatically detects datablock encoding in HFileOutputFormat. > 2. Keeps the legacy option of manually specifying the datablock encoding > around as a method to override auto detections. > 3. Moves string conf parsing to the start of the program so that it fails > fast during starting up instead of failing during record writes. It also > makes the internals of the program type safe. > 4. Adds missing doc strings and unit tests for code serializing and > deserializing config paramerters for bloom filer type, block size and > datablock encoding. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat
[ https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13869184#comment-13869184 ] Ishan Chhabra commented on HBASE-10323: --- Added javadoc for the parameters and uploaded a patch for trunk. [~lhofhansl], what else should be auto-detected? I can add that as part of this or as a separate JIRA. > Auto detect data block encoding in HFileOutputFormat > > > Key: HBASE-10323 > URL: https://issues.apache.org/jira/browse/HBASE-10323 > Project: HBase > Issue Type: Improvement >Reporter: Ishan Chhabra >Assignee: Ishan Chhabra > Attachments: HBASE_10323-0.94.15-v1.patch, > HBASE_10323-0.94.15-v2.patch, HBASE_10323-trunk-v1.patch > > > Currently, one has to specify the data block encoding of the table explicitly > using the config parameter > "hbase.mapreduce.hfileoutputformat.datablock.encoding" when doing a bulk > load. This option is easily missed, not documented, and also works differently > than compression, block size and bloom filter type, which are auto detected. > The solution would be to add support to auto detect datablock encoding > similar to other parameters. > The current patch does the following: > 1. Automatically detects datablock encoding in HFileOutputFormat. > 2. Keeps the legacy option of manually specifying the datablock encoding > around as a method to override auto detection. > 3. Moves string conf parsing to the start of the program so that it fails > fast during startup instead of failing during record writes. It also > makes the internals of the program type safe. > 4. Adds missing doc strings and unit tests for code serializing and > deserializing config parameters for bloom filter type, block size and > datablock encoding. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat
[ https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ishan Chhabra updated HBASE-10323: -- Attachment: HBASE_10323-trunk-v1.patch HBASE_10323-0.94.15-v2.patch > Auto detect data block encoding in HFileOutputFormat > > > Key: HBASE-10323 > URL: https://issues.apache.org/jira/browse/HBASE-10323 > Project: HBase > Issue Type: Improvement >Reporter: Ishan Chhabra >Assignee: Ishan Chhabra > Attachments: HBASE_10323-0.94.15-v1.patch, > HBASE_10323-0.94.15-v2.patch, HBASE_10323-trunk-v1.patch > > > Currently, one has to specify the data block encoding of the table explicitly > using the config parameter > "hbase.mapreduce.hfileoutputformat.datablock.encoding" when doing a bulkload > load. This option is easily missed, not documented and also works differently > than compression, block size and bloom filter type, which are auto detected. > The solution would be to add support to auto detect datablock encoding > similar to other parameters. > The current patch does the following: > 1. Automatically detects datablock encoding in HFileOutputFormat. > 2. Keeps the legacy option of manually specifying the datablock encoding > around as a method to override auto detections. > 3. Moves string conf parsing to the start of the program so that it fails > fast during starting up instead of failing during record writes. It also > makes the internals of the program type safe. > 4. Adds missing doc strings and unit tests for code serializing and > deserializing config paramerters for bloom filer type, block size and > datablock encoding. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat
[ https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ishan Chhabra updated HBASE-10323: -- Attachment: HBASE_10323-0.94.15-v1.patch > Auto detect data block encoding in HFileOutputFormat > > > Key: HBASE-10323 > URL: https://issues.apache.org/jira/browse/HBASE-10323 > Project: HBase > Issue Type: Improvement >Reporter: Ishan Chhabra >Assignee: Ishan Chhabra > Attachments: HBASE_10323-0.94.15-v1.patch > > > Currently, one has to specify the data block encoding of the table explicitly > using the config parameter > "hbase.mapreduce.hfileoutputformat.datablock.encoding" when doing a bulkload > load. This option is easily missed, not documented and also works differently > than compression, block size and bloom filter type, which are auto detected. > The solution would be to add support to auto detect datablock encoding > similar to other parameters. > The current patch does the following: > 1. Automatically detects datablock encoding in HFileOutputFormat. > 2. Keeps the legacy option of manually specifying the datablock encoding > around as a method to override auto detections. > 3. Moves string conf parsing to the start of the program so that it fails > fast during starting up instead of failing during record writes. It also > makes the internals of the program type safe. > 4. Adds missing doc strings and unit tests for code serializing and > deserializing config paramerters for bloom filer type, block size and > datablock encoding. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat
[ https://issues.apache.org/jira/browse/HBASE-10323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ishan Chhabra updated HBASE-10323: -- Status: Patch Available (was: Open) > Auto detect data block encoding in HFileOutputFormat > > > Key: HBASE-10323 > URL: https://issues.apache.org/jira/browse/HBASE-10323 > Project: HBase > Issue Type: Improvement >Reporter: Ishan Chhabra >Assignee: Ishan Chhabra > Attachments: HBASE_10323-0.94.15-v1.patch > > > Currently, one has to specify the data block encoding of the table explicitly > using the config parameter > "hbase.mapreduce.hfileoutputformat.datablock.encoding" when doing a bulkload > load. This option is easily missed, not documented and also works differently > than compression, block size and bloom filter type, which are auto detected. > The solution would be to add support to auto detect datablock encoding > similar to other parameters. > The current patch does the following: > 1. Automatically detects datablock encoding in HFileOutputFormat. > 2. Keeps the legacy option of manually specifying the datablock encoding > around as a method to override auto detections. > 3. Moves string conf parsing to the start of the program so that it fails > fast during starting up instead of failing during record writes. It also > makes the internals of the program type safe. > 4. Adds missing doc strings and unit tests for code serializing and > deserializing config paramerters for bloom filer type, block size and > datablock encoding. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (HBASE-10323) Auto detect data block encoding in HFileOutputFormat
Ishan Chhabra created HBASE-10323: - Summary: Auto detect data block encoding in HFileOutputFormat Key: HBASE-10323 URL: https://issues.apache.org/jira/browse/HBASE-10323 Project: HBase Issue Type: Improvement Reporter: Ishan Chhabra Assignee: Ishan Chhabra Currently, one has to specify the data block encoding of the table explicitly using the config parameter "hbase.mapreduce.hfileoutputformat.datablock.encoding" when doing a bulk load. This option is easily missed, not documented, and also works differently than compression, block size and bloom filter type, which are auto detected. The solution would be to add support to auto detect data block encoding similar to other parameters. The current patch does the following: 1. Automatically detects data block encoding in HFileOutputFormat. 2. Keeps the legacy option of manually specifying the data block encoding around as a method to override auto detection. 3. Moves string conf parsing to the start of the program so that it fails fast during startup instead of failing during record writes. It also makes the internals of the program type safe. 4. Adds missing doc strings and unit tests for code serializing and deserializing config parameters for bloom filter type, block size and data block encoding. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
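To illustrate point 3 above (fail-fast parsing at job setup), a hedged sketch of turning such a serialized per-family setting back into typed values; the config key name and the "family=ENCODING" format are the same illustrative assumptions as in the earlier sketch, not actual HFileOutputFormat internals.

{code:java}
import java.util.Map;
import java.util.TreeMap;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.io.encoding.DataBlockEncoding;
import org.apache.hadoop.hbase.util.Bytes;

public class DataBlockEncodingConfParsing {

  // Same hypothetical key as in the earlier sketch.
  static final String FAMILY_ENCODING_CONF_KEY =
      "hbase.mapreduce.hfileoutputformat.families.datablock.encoding";

  /**
   * Parses the serialized "family=ENCODING&family=ENCODING" string back into
   * typed values. An invalid encoding name throws here, at job setup,
   * rather than deep inside the record writer.
   */
  static Map<byte[], DataBlockEncoding> createFamilyEncodingMap(Configuration conf) {
    Map<byte[], DataBlockEncoding> map = new TreeMap<>(Bytes.BYTES_COMPARATOR);
    String serialized = conf.get(FAMILY_ENCODING_CONF_KEY, "");
    if (serialized.isEmpty()) {
      return map;
    }
    for (String entry : serialized.split("&")) {
      String[] parts = entry.split("=", 2);
      // Enum lookup fails fast on an unknown encoding name.
      map.put(Bytes.toBytes(parts[0]), DataBlockEncoding.valueOf(parts[1]));
    }
    return map;
  }
}
{code}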
[jira] [Commented] (HBASE-9934) Mesh replication (a.k.a. multi master replication)
[ https://issues.apache.org/jira/browse/HBASE-9934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13819887#comment-13819887 ] Ishan Chhabra commented on HBASE-9934: -- [~nidmhbase] Lars' understanding is correct. That is why this should be specified at the peer level and not the CF level. Also, when per-CF peer definitions are supported, this will fit in automatically at the peer level. [~lhofhansl] I'll give this a shot. Can you assign this to me? > Mesh replication (a.k.a. multi master replication) > -- > > Key: HBASE-9934 > URL: https://issues.apache.org/jira/browse/HBASE-9934 > Project: HBase > Issue Type: New Feature > Components: Replication >Reporter: Ishan Chhabra >Priority: Minor > > This is to setup NxN replication. > See background discussion here: > http://mail-archives.apache.org/mod_mbox/hbase-user/201311.mbox/%3CCAOiuM-4UMmLA7UHMp4hhjpLWUrHDxg1t4tN4aWvnZUMcTxG%2BKQ%40mail.gmail.com%3E > We can add a new mode in replication to not forward edits from other > clusters. Not sure what should be done when some clusters are configured with > this setting and some aren't. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9950) Row level replication
[ https://issues.apache.org/jira/browse/HBASE-9950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13819629#comment-13819629 ] Ishan Chhabra commented on HBASE-9950: -- [~stack] Not sure yet. Since there is no notion of row-level data in HBase storage, I would have to create some special KVs that are stored for the row, which sounds very hacky. [~apurtell] Replication scope is defined at the CF level, so I don't think I'll be able to use it. I do need to plug in a custom replication policy, though, if this is not a core feature. There are no observers for replication, are there? > Row level replication > - > > Key: HBASE-9950 > URL: https://issues.apache.org/jira/browse/HBASE-9950 > Project: HBase > Issue Type: Bug > Components: Replication >Reporter: Ishan Chhabra >Priority: Minor > > We have a replication setup with the same table and column family being > present in multiple data centers. Currently, all of them have exactly the > same data, but each cluster doesn't need all the data. Rows need to be > present in only x out of the total y clusters. This information varies at the > row level and thus more granular replication cannot be achieved by setting up > cluster level replication. > Adding row level replication should solve this. -- This message was sent by Atlassian JIRA (v6.1#6144)
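There is no such hook today, as the comment notes. Purely as an illustration of what "plugging in a custom replication policy" could look like, a hypothetical per-edit policy interface (not an existing HBase API) might be:

{code:java}
import java.util.List;
import java.util.Set;

/**
 * Hypothetical plug-in point, sketched for discussion only: a policy consulted
 * per WAL entry to decide which peer clusters should receive it. Nothing like
 * this exists in the current replication code; a trivial implementation that
 * returns all configured peers would reproduce today's behaviour.
 */
public interface RowReplicationPolicy {

  /**
   * @param table  table name of the edit
   * @param family column family of the edit
   * @param row    row key of the edit
   * @param peers  ids of all configured peer clusters
   * @return the subset of peers this edit should be shipped to
   */
  Set<String> targetPeers(String table, String family, byte[] row, List<String> peers);
}
{code}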
[jira] [Commented] (HBASE-9934) Mesh replication (a.k.a. multi master replication)
[ https://issues.apache.org/jira/browse/HBASE-9934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13819610#comment-13819610 ] Ishan Chhabra commented on HBASE-9934: -- Ok. The one above is still not right. The spaces are removed when I save and the weird strikethrough comes in. C1 and C2 are connected to C3 and not C4. > Mesh replication (a.k.a. multi master replication) > -- > > Key: HBASE-9934 > URL: https://issues.apache.org/jira/browse/HBASE-9934 > Project: HBase > Issue Type: New Feature > Components: Replication >Reporter: Ishan Chhabra >Priority: Minor > > This is to setup NxN replication. > See background discussion here: > http://mail-archives.apache.org/mod_mbox/hbase-user/201311.mbox/%3CCAOiuM-4UMmLA7UHMp4hhjpLWUrHDxg1t4tN4aWvnZUMcTxG%2BKQ%40mail.gmail.com%3E > We can add a new mode in replication to not forward edits from other > clusters. Not sure what should be done when some clusters are configured with > this setting and some aren't. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9934) Mesh replication (a.k.a. multi master replication)
[ https://issues.apache.org/jira/browse/HBASE-9934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13819608#comment-13819608 ] Ishan Chhabra commented on HBASE-9934: -- That is the first proposal. It will not work when people have a mixed setup. The second one, where a (source, sink) link pair is specified as belonging to a mesh network, should work better, but might be more dev work. Also, when you were cleaning up my comment, the diagram got changed. Below is the fixed version. C4 <-> C3 <-> C5 <-> C6 / \ C1 - C2 > Mesh replication (a.k.a. multi master replication) > -- > > Key: HBASE-9934 > URL: https://issues.apache.org/jira/browse/HBASE-9934 > Project: HBase > Issue Type: New Feature > Components: Replication >Reporter: Ishan Chhabra >Priority: Minor > > This is to setup NxN replication. > See background discussion here: > http://mail-archives.apache.org/mod_mbox/hbase-user/201311.mbox/%3CCAOiuM-4UMmLA7UHMp4hhjpLWUrHDxg1t4tN4aWvnZUMcTxG%2BKQ%40mail.gmail.com%3E > We can add a new mode in replication to not forward edits from other > clusters. Not sure what should be done when some clusters are configured with > this setting and some aren't. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HBASE-9951) Tags for HLogKey
Ishan Chhabra created HBASE-9951: Summary: Tags for HLogKey Key: HBASE-9951 URL: https://issues.apache.org/jira/browse/HBASE-9951 Project: HBase Issue Type: New Feature Components: wal Reporter: Ishan Chhabra Similar to the Cell interface, adding tags to the HLogKey could be useful for multiple scenarios. My primary use cases are driven by replication, though: 1. To record whether a WALEdit should be forwarded further to other clusters (see [#HBASE-9934]) 2. To record which clusters the WALEdit should be forwarded to (for row-level replication) 3. To mark a record as not to be replicated (these are some special cases in our usage and cannot be handled using a separate column family) -- This message was sent by Atlassian JIRA (v6.1#6144)
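As a rough illustration only (the Tag type, tag names and fields below are invented, mirroring the Cell tag idea rather than any existing HLogKey API), the three use cases map naturally onto small typed key-level tags:

{code:java}
import java.util.ArrayList;
import java.util.List;

/** Illustrative model of key-level tags; not an existing HLogKey API. */
public class LogKeyTagSketch {

  /** Hypothetical tag types for the three use cases listed above. */
  enum TagType { DO_NOT_FORWARD, TARGET_CLUSTERS, SKIP_REPLICATION }

  static final class Tag {
    final TagType type;
    final byte[] value;
    Tag(TagType type, byte[] value) { this.type = type; this.value = value; }
  }

  private final List<Tag> tags = new ArrayList<>();

  void add(Tag tag) { tags.add(tag); }

  boolean has(TagType type) {
    for (Tag t : tags) {
      if (t.type == type) return true;
    }
    return false;
  }

  public static void main(String[] args) {
    LogKeyTagSketch keyTags = new LogKeyTagSketch();
    // Use case 2: record the clusters this WALEdit should be forwarded to.
    keyTags.add(new Tag(TagType.TARGET_CLUSTERS, "dc1,dc3".getBytes()));
    // Use case 1: mark the edit as single-hop so peers do not forward it.
    keyTags.add(new Tag(TagType.DO_NOT_FORWARD, new byte[0]));
    System.out.println("forward further? " + !keyTags.has(TagType.DO_NOT_FORWARD));
  }
}
{code}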
[jira] [Commented] (HBASE-9934) Mesh replication (a.k.a. multi master replication)
[ https://issues.apache.org/jira/browse/HBASE-9934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13819517#comment-13819517 ] Ishan Chhabra commented on HBASE-9934: -- Changed the title to Mesh replication as that describes the feature request better. Multi-master replication may not be the best term, but it is the one used by the DBA community for these setups on MySQL and other RDBMSs. > Mesh replication (a.k.a. multi master replication) > -- > > Key: HBASE-9934 > URL: https://issues.apache.org/jira/browse/HBASE-9934 > Project: HBase > Issue Type: New Feature > Components: Replication >Reporter: Ishan Chhabra >Priority: Minor > > This is to setup NxN replication. > See background discussion here: > http://mail-archives.apache.org/mod_mbox/hbase-user/201311.mbox/%3CCAOiuM-4UMmLA7UHMp4hhjpLWUrHDxg1t4tN4aWvnZUMcTxG%2BKQ%40mail.gmail.com%3E > We can add a new mode in replication to not forward edits from other > clusters. Not sure what should be done when some clusters are configured with > this setting and some aren't. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HBASE-9950) Row level replication
Ishan Chhabra created HBASE-9950: Summary: Row level replication Key: HBASE-9950 URL: https://issues.apache.org/jira/browse/HBASE-9950 Project: HBase Issue Type: Bug Components: Replication Reporter: Ishan Chhabra Priority: Minor We have a replication setup with the same table and column family being present in multiple data centers. Currently, all of them have exactly the same data, but each cluster doesn't need all the data. Rows need to be present in only x out of the total y clusters. This information varies at the row level and thus more granular replication cannot be achieved by setting up cluster level replication. Adding row level replication should solve this. -- This message was sent by Atlassian JIRA (v6.1#6144)
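To make the "x out of the total y clusters" requirement concrete, a toy model of the per-row routing decision; the placement map and cluster ids are invented for illustration, and nothing here corresponds to existing replication code:

{code:java}
import java.util.Map;
import java.util.Set;

/** Toy model: each row maps to the subset of clusters that should hold it. */
public class RowLevelRoutingSketch {

  // Hypothetical source of per-row placement, e.g. a metadata column or an
  // external placement service.
  private final Map<String, Set<String>> rowToClusters;

  RowLevelRoutingSketch(Map<String, Set<String>> rowToClusters) {
    this.rowToClusters = rowToClusters;
  }

  /** Would an edit for this row be shipped to the given peer cluster? */
  boolean shouldReplicate(String row, String peerClusterId) {
    Set<String> targets = rowToClusters.get(row);
    return targets != null && targets.contains(peerClusterId);
  }

  public static void main(String[] args) {
    // "user-123" needs to live in only 2 of the 4 clusters.
    RowLevelRoutingSketch router = new RowLevelRoutingSketch(
        Map.of("user-123", Set.of("dc1", "dc3")));
    System.out.println(router.shouldReplicate("user-123", "dc3")); // true
    System.out.println(router.shouldReplicate("user-123", "dc2")); // false
  }
}
{code}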
[jira] [Updated] (HBASE-9934) Mesh replication (a.k.a. multi master replication)
[ https://issues.apache.org/jira/browse/HBASE-9934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ishan Chhabra updated HBASE-9934: - Summary: Mesh replication (a.k.a. multi master replication) (was: Suport not forwarding edits in replication) > Mesh replication (a.k.a. multi master replication) > -- > > Key: HBASE-9934 > URL: https://issues.apache.org/jira/browse/HBASE-9934 > Project: HBase > Issue Type: New Feature > Components: Replication >Reporter: Ishan Chhabra >Priority: Minor > > This is to setup NxN replication. > See background discussion here: > http://mail-archives.apache.org/mod_mbox/hbase-user/201311.mbox/%3CCAOiuM-4UMmLA7UHMp4hhjpLWUrHDxg1t4tN4aWvnZUMcTxG%2BKQ%40mail.gmail.com%3E > We can add a new mode in replication to not forward edits from other > clusters. Not sure what should be done when some clusters are configured with > this setting and some aren't. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HBASE-9934) Suport not forwarding edits in replication
[ https://issues.apache.org/jira/browse/HBASE-9934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13819329#comment-13819329 ] Ishan Chhabra commented on HBASE-9934: -- I agree. Let me set up a graph of HBase clusters for further discussion. Let's say we have 6 clusters, C1..C6. C1..C3 want to replicate to each other in the NxN fashion (i.e. with no forwarding) and the following master-master replications are set up: C4 <-> C3, C5 <-> C3 and C6 <-> C5 C4 <-> C3 <-> C5 <-> C6 /\ C1 - C2 1. The source cluster sets the replication scope for the column family that is being replicated as SINGLE_HOP_ONLY and the target clusters do not forward KeyValues with families that have this scope set. The decision on whether a write should be forwarded further thus lies with the source cluster in this case. This would work if we just had C1..C3, but will *not* work in the above example, since a write from C4 will not have the correct scope set and will circulate more in C1..C3 than needed. We could say that such a mixed setting is not supported, but it is hard to prevent someone from shooting themselves in the foot. 2. Having thought about it, this looks more like a property of the link (whether it is part of a mesh network or a standard Master-Slave / Master-Master link). If a link is part of the mesh network, then *all* writes coming over that link (including ones that could have originated at a different cluster) should not be forwarded. To set this up, we would have to add this as a property of the peer (in the ZooKeeper state?) and then, when an edit is sent across such a link, it should be marked as "do not forward". This could be part of the WALEdit key, or we could add support for tags to WALEdit (similar to cells) and add it there. Thoughts? > Suport not forwarding edits in replication > -- > > Key: HBASE-9934 > URL: https://issues.apache.org/jira/browse/HBASE-9934 > Project: HBase > Issue Type: New Feature > Components: Replication >Reporter: Ishan Chhabra >Priority: Minor > > This is to setup NxN replication. > See background discussion here: > http://mail-archives.apache.org/mod_mbox/hbase-user/201311.mbox/%3CCAOiuM-4UMmLA7UHMp4hhjpLWUrHDxg1t4tN4aWvnZUMcTxG%2BKQ%40mail.gmail.com%3E > We can add a new mode in replication to not forward edits from other > clusters. Not sure what should be done when some clusters are configured with > this setting and some aren't. -- This message was sent by Atlassian JIRA (v6.1#6144)
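A toy model of proposal 2 above; the per-peer mesh flag and the "do not forward" mark are both hypothetical, and nothing here corresponds to existing replication configuration:

{code:java}
/** Toy model of proposal 2: the single-hop decision is a property of the link. */
public class MeshLinkSketch {

  /** A replicated edit, reduced to what matters for the forwarding decision. */
  static final class Edit {
    final String originCluster;
    final boolean doNotForward; // set when the edit was shipped over a mesh link
    Edit(String originCluster, boolean doNotForward) {
      this.originCluster = originCluster;
      this.doNotForward = doNotForward;
    }
  }

  /** A peer link; 'mesh' is the hypothetical per-peer property stored with the peer. */
  static final class PeerLink {
    final String peerCluster;
    final boolean mesh;
    PeerLink(String peerCluster, boolean mesh) {
      this.peerCluster = peerCluster;
      this.mesh = mesh;
    }
  }

  /** Mark an outgoing edit as single-hop when it is shipped over a mesh link. */
  static Edit outgoing(Edit edit, PeerLink link) {
    return new Edit(edit.originCluster, link.mesh);
  }

  /** Should this cluster forward a received edit to the given peer? */
  static boolean shouldForward(Edit edit, PeerLink link) {
    if (edit.originCluster.equals(link.peerCluster)) {
      return false; // never echo an edit back to where it came from
    }
    return !edit.doNotForward; // edits received over a mesh link stop here
  }

  public static void main(String[] args) {
    PeerLink meshLinkToC2 = new PeerLink("C2", true);
    PeerLink normalLinkToC4 = new PeerLink("C4", false);
    // An edit from C4 arrives at C3 over a normal link...
    Edit fromC4 = new Edit("C4", false);
    // ...and is shipped to C2 marked single-hop, so C2 will not forward it to C1.
    Edit shippedToC2 = outgoing(fromC4, meshLinkToC2);
    System.out.println(shouldForward(shippedToC2, new PeerLink("C1", true))); // false
    System.out.println(shouldForward(fromC4, normalLinkToC4)); // false: came from C4
  }
}
{code}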
[jira] [Created] (HBASE-9934) Suport not forwarding edits in replication
Ishan Chhabra created HBASE-9934: Summary: Suport not forwarding edits in replication Key: HBASE-9934 URL: https://issues.apache.org/jira/browse/HBASE-9934 Project: HBase Issue Type: New Feature Components: Replication Reporter: Ishan Chhabra Priority: Minor This is to setup NxN replication. See background discussion here: http://mail-archives.apache.org/mod_mbox/hbase-user/201311.mbox/%3CCAOiuM-4UMmLA7UHMp4hhjpLWUrHDxg1t4tN4aWvnZUMcTxG%2BKQ%40mail.gmail.com%3E We can add a new mode in replication to not forward edits from other clusters. Not sure what should be done when some clusters are configured with this setting and some aren't. -- This message was sent by Atlassian JIRA (v6.1#6144)