[jira] [Commented] (HBASE-14457) Umbrella: Improve Multiple WAL for production usage

2016-01-19 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15107843#comment-15107843
 ] 

Enis Soztutar commented on HBASE-14457:
---

See HBASE-15131. 

> Umbrella: Improve Multiple WAL for production usage
> ---
>
> Key: HBASE-14457
> URL: https://issues.apache.org/jira/browse/HBASE-14457
> Project: HBase
> Issue Type: Umbrella
> Reporter: Yu Li
> Assignee: Yu Li
> Fix For: 2.0.0, 1.3.0
>
> Attachments: Action in Multiple WAL.pdf, Action in Multiple WAL.pdf
>
>
> HBASE-5699 proposed the idea of running multiple WALs in a regionserver and
> did great initial work there, but when trying to use it in our production
> cluster, we still found several issues to resolve, like tracking multiple WAL
> paths in replication (HBASE-6617), fixing UTs with the multiwal provider
> (HBASE-14411), introducing a namespace-based strategy for
> RegionGroupingProvider (HBASE-14456), etc. This is an umbrella including (but
> not limited to) all these works and efforts to make multiple WAL ready for
> production usage and give users a clear picture of it.
> Besides the development work done, I'd also like to share some scenarios and
> testing/online data in this JIRA about our usage/performance of multiple WAL,
> to (hopefully) help people better judge whether to enable multiple WAL in
> their own clusters and what they might gain.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HBASE-14457) Umbrella: Improve Multiple WAL for production usage

2016-01-19 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15107839#comment-15107839
 ] 

Enis Soztutar commented on HBASE-14457:
---

This is great. We should enable Multi-WAL by default with the bounded provider
and 4 WAL groups, I think. Let me open a JIRA.
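
For readers unfamiliar with the bounded approach mentioned above, here is a
minimal Python sketch of the general idea: regions are spread over a fixed
number of WAL groups, so the number of WALs per regionserver stays capped no
matter how many regions it hosts. The class and method names are hypothetical,
the round-robin assignment is an assumption made for illustration, and the "4"
is read here as the number of groups. This is not HBase's actual code.

```python
from itertools import count


class BoundedGrouping:
    """Hypothetical sketch: map each region to one of a fixed number of WAL groups."""

    def __init__(self, num_groups=4):
        if num_groups < 1:
            raise ValueError("num_groups must be at least 1")
        self.num_groups = num_groups
        self._next = count()   # round-robin counter (assumed assignment policy)
        self._assigned = {}    # region name -> group index, so a region keeps its WAL

    def group(self, region):
        """Return the WAL group index for a region, assigning one on first use."""
        if region not in self._assigned:
            self._assigned[region] = next(self._next) % self.num_groups
        return self._assigned[region]


grouping = BoundedGrouping(num_groups=4)
assignments = [grouping.group(f"region-{i}") for i in range(6)]
```

With four groups, six regions share four WALs, and asking again for a region
that was already seen returns the same group.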


[jira] [Commented] (HBASE-14457) Umbrella: Improve Multiple WAL for production usage

2016-01-12 Thread Sean Busbey (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15095513#comment-15095513
 ] 

Sean Busbey commented on HBASE-14457:
-

What version of YCSB did y'all use?


[jira] [Commented] (HBASE-14457) Umbrella: Improve Multiple WAL for production usage

2016-01-12 Thread Yu Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15095593#comment-15095593
 ] 

Yu Li commented on HBASE-14457:
---

bq. What are the units for throughput on the last table?
As Ted explained, it's a Chinese abbreviation for 10k.

bq. Was the network ever saturated in the test?
No. According to the monitoring data, the network peak was less than 140MB/s,
which should be fine for a 2Gb network card.
Regarding the relatively high average latency, I think it's because there are
16 column qualifiers in our test table (to simulate our specific online
scenario).

bq. Were flushes ever the bottleneck?
No, we didn't observe any blocked updates in the test.
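
As a rough check of the network answer above, the arithmetic can be sketched
as follows. It assumes that the 2Gb card means 2 gigabits per second, decimal
units, and no protocol overhead; the function name is made up for
illustration.

```python
def link_capacity_mb_per_s(gigabits_per_s):
    """Convert a link speed in Gb/s to MB/s (decimal units, 8 bits per byte)."""
    return gigabits_per_s * 1000 / 8


capacity = link_capacity_mb_per_s(2)   # nominal capacity of the 2Gb card
peak = 140                             # observed peak in MB/s, from the comment above
utilization = peak / capacity          # fraction of nominal capacity in use at peak
print(f"capacity={capacity:.0f} MB/s, peak utilization={utilization:.0%}")
```

Under those assumptions the card tops out at about 250MB/s, so a 140MB/s peak
leaves the link a little over half used.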


[jira] [Commented] (HBASE-14457) Umbrella: Improve Multiple WAL for production usage

2016-01-12 Thread Yu Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15095481#comment-15095481
 ] 

Yu Li commented on HBASE-14457:
---

Thanks [~tedyu] for helping clarify the throughput units. I have just updated
the doc and changed the unit to k for better understanding. Also, thanks for
reviewing the doc offline before I uploaded it here.


[jira] [Commented] (HBASE-14457) Umbrella: Improve Multiple WAL for production usage

2016-01-12 Thread Yu Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15095606#comment-15095606
 ] 

Yu Li commented on HBASE-14457:
---

Oh yes, I forgot to mention that in the doc. The YCSB version is 0.3.1, and the
HBase version in the last test (PCIe SSD) is our 0.98.12 with the multiple WAL
function backported. We used 0.98.12 since we need a comparison with our online
data, JFYI.


[jira] [Commented] (HBASE-14457) Umbrella: Improve Multiple WAL for production usage

2016-01-12 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15095276#comment-15095276
 ] 

Enis Soztutar commented on HBASE-14457:
---

Great doc. What does the throughput {{2.7w}} indicate in the YCSB results?


[jira] [Commented] (HBASE-14457) Umbrella: Improve Multiple WAL for production usage

2016-01-12 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15095284#comment-15095284
 ] 

Ted Yu commented on HBASE-14457:


w is shorthand for the Chinese wan, meaning 10,000.
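
As a quick worked conversion of that unit, here is a small Python sketch that
turns a value written with the "w" suffix, such as the 2.7w asked about in
this thread, into a plain number. The helper name is hypothetical, and Decimal
is used only to avoid floating-point rounding.

```python
from decimal import Decimal

WAN = 10_000  # one "wan" (w) is ten thousand


def wan_to_count(text):
    """Convert a string such as '2.7w' into a plain integer count."""
    if not text.endswith("w"):
        raise ValueError(f"expected a value ending in 'w', got {text!r}")
    return int(Decimal(text[:-1]) * WAN)


print(wan_to_count("2.7w"))
```

So 2.7w reads as 27,000, or 27k in the unit the doc was later switched to.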


[jira] [Commented] (HBASE-14457) Umbrella: Improve Multiple WAL for production usage

2016-01-12 Thread Enis Soztutar (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15095295#comment-15095295
 ] 

Enis Soztutar commented on HBASE-14457:
---

bq. w is short hand for Chinese wan, meaning 10,000
TIL. Thanks Ted. 


[jira] [Commented] (HBASE-14457) Umbrella: Improve Multiple WAL for production usage

2016-01-12 Thread Elliott Clark (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15094305#comment-15094305
 ] 

Elliott Clark commented on HBASE-14457:
---

Thanks for publishing your tests. These are awesome. We are also running
multi-wal in production; we saw good results at the time.

What are the units for throughput on the last table?
Was the network ever saturated in the test? (The reason I ask is that the
average latency seems pretty high.)
Were flushes ever the bottleneck? I.e., did mutates ever block for being over
the high water mark?


[jira] [Commented] (HBASE-14457) Umbrella: Improve Multiple WAL for production usage

2015-09-21 Thread Vladimir Rodionov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14901056#comment-14901056
 ] 

Vladimir Rodionov commented on HBASE-14457:
---

I have a use case for multiwal: can we have separate WALs for different column
families? For example, we have two CFs in a table: one accepts a very high rate
of writes, while the other is an aggregation of the first and carries a very
light load. I would like the ability to flush these CFs independently of each
other and to have separate periodic flush intervals for them.

[jira] [Commented] (HBASE-14457) Umbrella: Improve Multiple WAL for production usage

2015-09-21 Thread Yu Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14901095#comment-14901095
 ] 

Yu Li commented on HBASE-14457:
---

I believe we already have the per-CF flush feature; please refer to
HBASE-3149 and see whether it meets your requirement. If not, we could discuss
again. :-)

[jira] [Commented] (HBASE-14457) Umbrella: Improve Multiple WAL for production usage

2015-09-21 Thread Vladimir Rodionov (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-14457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14901156#comment-14901156
 ] 

Vladimir Rodionov commented on HBASE-14457:
---

I am aware of this feature, but it does not help when the periodic memstore
flusher makes its decision based on a global periodic flush interval (one for
all tables/CFs). It would be nice to have different flush intervals for
different types of column families, but this will be possible only when we are
able to assign different WALs to them. One of the major reasons for periodic
memstore flush is to guarantee that we do not have runaway WAL files.
