Re: [openstack-dev] [Swift] Erasure coding and geo replication

2016-04-20 Thread John Dickinson
There's no significant change with the global EC clusters story in the 2.7 
release. That's something we're discussing next week at the summit.

--John



On 19 Apr 2016, at 22:47, Mark Kirkwood wrote:

> Hi,
>
> Has the release of 2.7 significantly changed the assessment here?
>
> Thanks
>
> Mark
>
> On 15/02/16 23:29, Kota TSUYUZAKI wrote:
>> Hello Mark,
>>
>> AFAIK, there are a few reasons why erasure code + geo replication is still a
>> work in progress.
>>
>>>> and expect to survive a region outage...
>>>>
>>>> With that in mind I did some experiments (Liberty swift) and it looks to me
>>>> like if you have:
>>>>
>>>> - num_data_frags < num_nodes in (smallest) region
>>>>
>>>> and:
>>>>
>>>> - num_parity_frags = num_data_frags
>>>>
>>>>
>>>> then having a region fail does not result in service outage.
>>
>> Good point, but note that PyECLib v1.0.7 (pinned to Kilo/Liberty stable)
>> still has a problem where it cannot decode the original data when all of
>> the supplied fragments are parity frags[1]. (i.e. if you set
>> num_parity_frags = num_data_frags and only parity fragments come into the
>> proxy for a GET request, it will fail at decoding.) The problem has
>> already been resolved on the PyECLib/liberasurecode master branch, and
>> current Swift master depends on PyECLib>=1.0.7, so if you are thinking of
>> using the newest Swift it may not be an issue.
>>
>> From the Swift perspective, I think we need more tests/discussion around
>> write/read affinity[2], which is the geo replication machinery in Swift
>> itself, and around performance.
>>
>> For write/read affinity, we deliberately left affinity control aside to
>> simplify the implementation until EC landed in Swift master[3], so I think
>> it's time to work out how we can use affinity control with EC, but that
>> isn't done yet.
>>
>> From the performance perspective, in my experiments more parity fragments
>> cause significant performance degradation[4]. To prevent that degradation,
>> I am working on a spec that makes duplicated copies of the data/parity
>> fragments and spreads them out across geo regions.
>>
>> To summarize, we haven't done the work yet, but discussion of and
>> contributions to EC + geo replication are welcome anytime, IMO.
>>
>> Thanks,
>> Kota
>>
>> 1: 
>> https://bitbucket.org/tsg-/liberasurecode/commits/a01b1818c874a65d1d1fb8f11ea441e9d3e18771
>> 2: 
>> http://docs.openstack.org/developer/swift/admin_guide.html#geographically-distributed-clusters
>> 3: 
>> http://docs.openstack.org/developer/swift/overview_erasure_code.html#region-support
>> 4: 
>> https://specs.openstack.org/openstack/swift-specs/specs/in_progress/global_ec_cluster.html
>>
>>
>>
>> (2016/02/15 18:00), Mark Kirkwood wrote:
>>> After looking at:
>>>
>>> https://www.youtube.com/watch?v=9YHvYkcse-k
>>>
>>> I have a question (that follows on from Bruno's) about using erasure coding 
>>> with geo replication.
>>>
>>> Now, the example given to show why you could/should not use erasure coding
>>> with geo replication is somewhat flawed, as it is immediately clear that
>>> you cannot set:
>>>
>>> - num_data_frags > num_devices (or nodes) in a region
>>>
>>> and expect to survive a region outage...
>>>
>>> With that in mind I did some experiments (Liberty swift) and it looks to me
>>> like if you have:
>>>
>>> - num_data_frags < num_nodes in (smallest) region
>>>
>>> and:
>>>
>>> - num_parity_frags = num_data_frags
>>>
>>>
>>> then having a region fail does not result in service outage.
>>>
>>> So my real question is - it looks like it *is* possible to use erasure 
>>> coding in geo replicated situations - however I may well be missing 
>>> something significant, so I'd love some clarification here [1]!
>>>
>>> Cheers
>>>
>>> Mark
>>>
>>> [1] The reduction in disk usage and net traffic looks attractive
>>>
>>
>>
>
>




Re: [openstack-dev] [Swift] Erasure coding and geo replication

2016-04-19 Thread Mark Kirkwood

Hi,

Has the release of 2.7 significantly changed the assessment here?

Thanks

Mark

On 15/02/16 23:29, Kota TSUYUZAKI wrote:

Hello Mark,

AFAIK, there are a few reasons why erasure code + geo replication is still a
work in progress.


and expect to survive a region outage...

With that in mind I did some experiments (Liberty swift) and it looks to me like
if you have:

- num_data_frags < num_nodes in (smallest) region

and:

- num_parity_frags = num_data_frags


then having a region fail does not result in service outage.


Good point, but note that PyECLib v1.0.7 (pinned to Kilo/Liberty stable) still
has a problem where it cannot decode the original data when all of the supplied
fragments are parity frags[1]. (i.e. if you set num_parity_frags =
num_data_frags and only parity fragments come into the proxy for a GET request,
it will fail at decoding.) The problem has already been resolved on the
PyECLib/liberasurecode master branch, and current Swift master depends on
PyECLib>=1.0.7, so if you are thinking of using the newest Swift it may not be
an issue.

From the Swift perspective, I think we need more tests/discussion around
write/read affinity[2], which is the geo replication machinery in Swift itself,
and around performance.

For write/read affinity, we deliberately left affinity control aside to
simplify the implementation until EC landed in Swift master[3], so I think it's
time to work out how we can use affinity control with EC, but that isn't done
yet.

From the performance perspective, in my experiments more parity fragments
cause significant performance degradation[4]. To prevent that degradation, I am
working on a spec that makes duplicated copies of the data/parity fragments and
spreads them out across geo regions.

To summarize, we haven't done the work yet, but discussion of and
contributions to EC + geo replication are welcome anytime, IMO.

Thanks,
Kota

1: 
https://bitbucket.org/tsg-/liberasurecode/commits/a01b1818c874a65d1d1fb8f11ea441e9d3e18771
2: 
http://docs.openstack.org/developer/swift/admin_guide.html#geographically-distributed-clusters
3: 
http://docs.openstack.org/developer/swift/overview_erasure_code.html#region-support
4: 
https://specs.openstack.org/openstack/swift-specs/specs/in_progress/global_ec_cluster.html



(2016/02/15 18:00), Mark Kirkwood wrote:

After looking at:

https://www.youtube.com/watch?v=9YHvYkcse-k

I have a question (that follows on from Bruno's) about using erasure coding 
with geo replication.

Now, the example given to show why you could/should not use erasure coding
with geo replication is somewhat flawed, as it is immediately clear that you
cannot set:

- num_data_frags > num_devices (or nodes) in a region

and expect to survive a region outage...

With that in mind I did some experiments (Liberty swift) and it looks to me
like if you have:

- num_data_frags < num_nodes in (smallest) region

and:

- num_parity_frags = num_data_frags


then having a region fail does not result in service outage.

So my real question is - it looks like it *is* possible to use erasure coding 
in geo replicated situations - however I may well be missing something 
significant, so I'd love some clarification here [1]!

Cheers

Mark

[1] The reduction in disk usage and net traffic looks attractive



Re: [openstack-dev] [Swift] Erasure coding and geo replication

2016-02-15 Thread Mark Kirkwood

On 16/02/16 17:10, Mark Kirkwood wrote:

On 15/02/16 23:29, Kota TSUYUZAKI wrote:

Hello Mark,

AFAIK, there are a few reasons why erasure code + geo replication is still
a work in progress.


and expect to survive a region outage...

With that in mind I did some experiments (Liberty swift) and it looks
to me like if you have:

- num_data_frags < num_nodes in (smallest) region

and:

- num_parity_frags = num_data_frags


then having a region fail does not result in service outage.


Good point, but note that PyECLib v1.0.7 (pinned to Kilo/Liberty
stable) still has a problem where it cannot decode the original data
when all of the supplied fragments are parity frags[1]. (i.e. if you
set num_parity_frags = num_data_frags and only parity fragments come
into the proxy for a GET request, it will fail at decoding.) The
problem has already been resolved on the PyECLib/liberasurecode
master branch, and current Swift master depends on PyECLib>=1.0.7,
so if you are thinking of using the newest Swift it may not be an
issue.



Ah right, in my testing I always took down my "1st" region... which will
have had data fragments therein. For interest, I'll try to provoke a
situation where I have only parity fragments to assemble from (and see what
happens).




So I tried this out - it still works fine. Checking the version of pyeclib,
I see Ubuntu 15.10 is giving me:


- Swift 2.5.0
- pyeclib 1.0.8
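
For anyone wanting to check the same thing, one distro-agnostic way (assuming
both packages are importable by the system Python) is:

python -c "import pkg_resources; print(pkg_resources.get_distribution('pyeclib').version)"
python -c "import swift; print(swift.__version__)"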

Hmmm - Canonical deliberately upping the version of pyeclib (shock)?
Interesting... anyway, that explains why I cannot get it to fail. However,
all your other points are noted, and again thanks!


Regards

Mark




Re: [openstack-dev] [Swift] Erasure coding and geo replication

2016-02-15 Thread Mark Kirkwood

On 15/02/16 23:29, Kota TSUYUZAKI wrote:

Hello Mark,

AFAIK, there are a few reasons why erasure code + geo replication is still a
work in progress.


and expect to survive a region outage...

With that in mind I did some experiments (Liberty swift) and it looks to me
like if you have:

- num_data_frags < num_nodes in (smallest) region

and:

- num_parity_frags = num_data_frags


then having a region fail does not result in service outage.


Good point, but note that PyECLib v1.0.7 (pinned to Kilo/Liberty stable) still
has a problem where it cannot decode the original data when all of the supplied
fragments are parity frags[1]. (i.e. if you set num_parity_frags =
num_data_frags and only parity fragments come into the proxy for a GET request,
it will fail at decoding.) The problem has already been resolved on the
PyECLib/liberasurecode master branch, and current Swift master depends on
PyECLib>=1.0.7, so if you are thinking of using the newest Swift it may not be
an issue.



Ah right, in my testing I always took down my "1st" region... which will
have had data fragments therein. For interest, I'll try to provoke a
situation where I have only parity fragments to assemble from (and see what
happens).




From the Swift perspective, I think we need more tests/discussion around
write/read affinity[2], which is the geo replication machinery in Swift itself,
and around performance.

For write/read affinity, we deliberately left affinity control aside to
simplify the implementation until EC landed in Swift master[3], so I think it's
time to work out how we can use affinity control with EC, but that isn't done
yet.

From the performance perspective, in my experiments more parity fragments
cause significant performance degradation[4]. To prevent that degradation, I am
working on a spec that makes duplicated copies of the data/parity fragments and
spreads them out across geo regions.

To summarize, we haven't done the work yet, but discussion of and
contributions to EC + geo replication are welcome anytime, IMO.

Thanks,
Kota

1: 
https://bitbucket.org/tsg-/liberasurecode/commits/a01b1818c874a65d1d1fb8f11ea441e9d3e18771
2: 
http://docs.openstack.org/developer/swift/admin_guide.html#geographically-distributed-clusters
3: 
http://docs.openstack.org/developer/swift/overview_erasure_code.html#region-support
4: 
https://specs.openstack.org/openstack/swift-specs/specs/in_progress/global_ec_cluster.html





Excellent - thank you for a very comprehensive answer.

Regards

Mark





Re: [openstack-dev] [Swift] Erasure coding and geo replication

2016-02-15 Thread Kota TSUYUZAKI
Hello Mark,

AFAIK, there are a few reasons why erasure code + geo replication is still a
work in progress.

>> and expect to survive a region outage...
>>
>> With that in mind I did some experiments (Liberty swift) and it looks to me
>> like if you have:
>>
>> - num_data_frags < num_nodes in (smallest) region
>>
>> and:
>>
>> - num_parity_frags = num_data_frags
>>
>>
>> then having a region fail does not result in service outage.

Good point, but note that PyECLib v1.0.7 (pinned to Kilo/Liberty stable) still
has a problem where it cannot decode the original data when all of the supplied
fragments are parity frags[1]. (i.e. if you set num_parity_frags =
num_data_frags and only parity fragments come into the proxy for a GET request,
it will fail at decoding.) The problem has already been resolved on the
PyECLib/liberasurecode master branch, and current Swift master depends on
PyECLib>=1.0.7, so if you are thinking of using the newest Swift it may not be
an issue.
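
For the record, this is easy to reproduce outside Swift with PyECLib
directly. A minimal sketch - the 4+4 scheme and ec_type here are just an
example, not a recommendation:

# Encode an object, discard the data fragments, and decode from parity
# fragments only. On the PyECLib/liberasurecode pinned for Kilo/Liberty
# this decode fails; on fixed versions it succeeds.
from pyeclib.ec_iface import ECDriver

driver = ECDriver(k=4, m=4, ec_type='liberasurecode_rs_vand')
obj = b'some object body' * 1000
frags = driver.encode(obj)      # returns k data + m parity fragments
parity_only = frags[4:]         # keep only the m parity fragments
assert driver.decode(parity_only) == obj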

From the Swift perspective, I think we need more tests/discussion around
write/read affinity[2], which is the geo replication machinery in Swift itself,
and around performance.

For write/read affinity, we deliberately left affinity control aside to
simplify the implementation until EC landed in Swift master[3], so I think it's
time to work out how we can use affinity control with EC, but that isn't done
yet.
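
For concreteness, the affinity knobs I mean are the documented
replication-oriented ones in the proxy-server.conf [app:proxy-server]
section[2]; the values below are just an example for a two-region cluster,
and how they should interact with EC is exactly the open question:

[app:proxy-server]
sorting_method = affinity
# prefer reads from region 1, then region 2
read_affinity = r1=100, r2=200
# write synchronously only to region 1; local handoffs stand in for
# remote primaries and are replicated to them asynchronously later
write_affinity = r1
write_affinity_node_count = 2 * replicas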

From the performance perspective, in my experiments more parity fragments
cause significant performance degradation[4]. To prevent that degradation, I am
working on a spec that makes duplicated copies of the data/parity fragments and
spreads them out across geo regions.
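
The core idea of that spec, as a purely illustrative Python sketch (my
shorthand for the concept, not the spec's actual design - see [4] for the
real proposal):

def place_duplicated_fragments(fragments, regions):
    # Give every region a full copy of the k+m fragments, so a single
    # region can serve a GET without fetching fragments over the WAN
    # and the cluster survives losing any one region.
    return {region: list(fragments) for region in regions}

# e.g. duplicate an 8-fragment (4+4) set across two regions:
# placement = place_duplicated_fragments(frags, ['r1', 'r2'])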

To summarize, we haven't done the work yet, but discussion of and
contributions to EC + geo replication are welcome anytime, IMO.

Thanks,
Kota

1: 
https://bitbucket.org/tsg-/liberasurecode/commits/a01b1818c874a65d1d1fb8f11ea441e9d3e18771
2: 
http://docs.openstack.org/developer/swift/admin_guide.html#geographically-distributed-clusters
3: 
http://docs.openstack.org/developer/swift/overview_erasure_code.html#region-support
4: 
https://specs.openstack.org/openstack/swift-specs/specs/in_progress/global_ec_cluster.html



(2016/02/15 18:00), Mark Kirkwood wrote:
> After looking at:
> 
> https://www.youtube.com/watch?v=9YHvYkcse-k
> 
> I have a question (that follows on from Bruno's) about using erasure coding 
> with geo replication.
> 
> Now, the example given to show why you could/should not use erasure coding
> with geo replication is somewhat flawed, as it is immediately clear that you
> cannot set:
> 
> - num_data_frags > num_devices (or nodes) in a region
> 
> and expect to survive a region outage...
> 
> With that in mind I did some experiments (Liberty swift) and it looks to me
> like if you have:
> 
> - num_data_frags < num_nodes in (smallest) region
> 
> and:
> 
> - num_parity_frags = num_data_frags
> 
> 
> then having a region fail does not result in service outage.
> 
> So my real question is - it looks like it *is* possible to use erasure coding 
> in geo replicated situations - however I may well be missing something 
> significant, so I'd love some clarification here [1]!
> 
> Cheers
> 
> Mark
> 
> [1] The reduction in disk usage and net traffic looks attractive
> 


-- 
--
Kota Tsuyuzaki(露﨑 浩太)  
NTT Software Innovation Center
Cloud Solution Project
Phone  0422-59-2837
Fax    0422-59-2965
---





[openstack-dev] [Swift] Erasure coding and geo replication

2016-02-15 Thread Mark Kirkwood

After looking at:

https://www.youtube.com/watch?v=9YHvYkcse-k

I have a question (that follows on from Bruno's) about using erasure 
coding with geo replication.


Now, the example given to show why you could/should not use erasure
coding with geo replication is somewhat flawed, as it is immediately
clear that you cannot set:


- num_data_frags > num_devices (or nodes) in a region

and expect to survive a region outage...

With that in mind I did some experiments (Liberty swift) and it looks to
me like if you have:


- num_data_frags < num_nodes in (smallest) region

and:

- num_parity_frags = num_data_frags


then having a region fail does not result in service outage.
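
For concreteness, a sketch of a swift.conf storage policy satisfying those
two conditions (the 4+4 numbers are illustrative only - they assume every
region has at least 5 nodes):

[storage-policy:1]
name = ec-geo
policy_type = erasure_coding
ec_type = liberasurecode_rs_vand
# keep num_data_frags below the node count of the smallest region
ec_num_data_fragments = 4
# num_parity_frags = num_data_frags
ec_num_parity_fragments = 4
ec_object_segment_size = 1048576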

So my real question is - it looks like it *is* possible to use erasure 
coding in geo replicated situations - however I may well be missing 
something significant, so I'd love some clarification here [1]!


Cheers

Mark

[1] The reduction in disk usage and net traffic looks attractive
