I've been using container pools for a while on one of our prod servers (built
with container pools from the start) and in test, but in the last few months,
since upgrading all of my servers to 8.1.5, I created new container pools on
all of the servers and switched everything over to back up to the new pools.
I've done some conversions where practical, but I'm doing some of that through
attrition. The conversion process is fairly painless and can be stopped and
restarted.
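For anyone who hasn't run a conversion yet, it's driven by CONVERT STGPOOL,
and the DURATION parameter is what lets you do it in chunks. A minimal sketch,
with placeholder pool names:

/* Convert legacy pool OLDPOOL into directory-container pool DCPOOL. */
/* DURATION caps this run at 4 hours; re-issue until the source pool */
/* is empty.                                                         */
convert stgpool OLDPOOL DCPOOL maxprocess=4 duration=240
/* Watch progress; Pct Util on the source pool drops as data moves.  */
query stgpool OLDPOOL format=detailed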
So far I'm really happy with the performance of container pools, and they're a
ton easier to manage.
I have a couple of questions for the more experienced container pool users.
1. As mentioned, exports don't work with container pools, but I still need to
move nodes between servers occasionally. I have 3 levels of service. The
bottom tier is for archive, with data stored on tape only, which is the
cheapest and slowest. Our customers will often decommission a server but want
to keep the backup data for some amount of time. For those people we used to
export them to the archive server. We can't do that anymore, which is a bit of
a problem. It seems like the only way to do this now is with node replication.
We're not using replication for anything else, and it seems a bit clunky since
a TSM server can only have one replication target for all nodes on the server.
It would be pretty much impossible for people who actually do node
replication. Is there another way to accomplish this?
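For reference, the one-time replication I have in mind would look roughly like
this (server and node names are placeholders, and it assumes the archive
server is already defined to the source server):

/* One-time push of a decommissioned node to the archive server.   */
set replserver ARCHSRV
update node RETIREDNODE replstate=enabled
replicate node RETIREDNODE wait=yes
/* Once the data is verified on ARCHSRV, stop tracking replication */
/* for this node on the source.                                    */
remove replnode RETIREDNODE

The SET REPLSERVER step is exactly the clunky part: it's server-wide, so it
would stomp on the target for anyone already replicating their other nodes
somewhere else.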
2. I've got a support case open with IBM on this, but we're kind of going in
circles. This is only happening on 1 of 8 servers that use container pools.
For my container directories I'm using 2 TB AIX/JFS2 file systems running off
fibre-channel-connected NetApps.
The server often fills those file systems right to the brim, with 0 bytes free
reported by df, which seems to be OK most of the time. In the last couple of
weeks I started getting errors like this:
Sep 26, 2018, 2:15:46 PM ANR0204I The container state for
/bucky1dc011/5a/0000000000005a8a.dcf is updated from AVAILABLE to UNAVAILABLE.
(PROCESS: 138)
Sep 26, 2018, 2:15:46 PM ANR3660E An unexpected error occurred while opening or
writing to the container. Container /bucky1dc011/5a/0000000000005a8a.dcf in
stgpool DCPOOL has been marked as UNAVAILABLE and should be audited to validate
accessibility and content. (PROCESS: 138)
Sep 26, 2018, 2:15:46 PM ANR0986I Process 138 for Move Container (Automatic)
running in the BACKGROUND processed 26,514 items for a total of 8,165,761,024
bytes with a completion state of WARNING at 14:15:46. (PROCESS: 138)
The file system reports as full:
$ df /bucky1dc011
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/fslv102 2145386496 2145386496 0 100% /bucky1dc011
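On the server side, QUERY STGPOOLDIRECTORY shows the same directories, and in
theory a full one could be taken out of the write rotation as a stopgap (the
read-only step below is just an idea, not something IBM has suggested):

/* Server-side view of the pool's container directories.             */
query stgpooldirectory stgpool=DCPOOL format=detailed
/* Possible stopgap: stop new writes to the full directory so ingest */
/* lands in the pool's other directories instead.                    */
update stgpooldirectory DCPOOL /bucky1dc011 access=readonly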
So I run an audit on the container. It immediately marks the container back as
available, even though the audit hasn't completed. The audit does complete
successfully, but the container is already back to unavailable before it
finishes:
Sep 26, 2018, 4:58:26 PM ANR4886I Audit Container (Scan) process started for
container /bucky1dc011/5a/0000000000005a8a.dcf (process ID 199). (SESSION:
531830, PROCESS: 199)
Sep 26, 2018, 4:58:26 PM ANR0984I Process 199 for AUDIT CONTAINER (SCAN)
started in the BACKGROUND at 16:58:26. (SESSION: 531830, PROCESS: 199)
Sep 26, 2018, 4:58:26 PM ANR0984I Process 198 for AUDIT CONTAINER started in
the BACKGROUND at 16:58:26. (SESSION: 531830, PROCESS: 198)
Sep 26, 2018, 4:58:27 PM ANR0204I The container state for
/bucky1dc011/5a/0000000000005a8a.dcf is updated from UNAVAILABLE to AVAILABLE.
(SESSION: 531830, PROCESS: 199)
Sep 26, 2018, 4:58:51 PM ANR3660E An unexpected error occurred while opening or
writing to the container. Container /bucky1dc011/5a/0000000000005a8a.dcf in
stgpool DCPOOL has been marked as UNAVAILABLE and should be audited to validate
accessibility and content. (PROCESS: 196)
Sep 26, 2018, 5:04:13 PM ANR4891I AUDIT CONTAINER process 199 ended for the
/bucky1dc011/5a/0000000000005a8a.dcf container: 29207 data extents inspected, 0
data extents marked as damaged, 0 data extents previously marked as damaged
reset to undamaged, and 0 data extents marked as orphaned. (SESSION: 531830,
PROCESS: 199)
Sep 26, 2018, 5:04:13 PM ANR0986I Process 199 for AUDIT CONTAINER (SCAN)
running in the BACKGROUND processed 29,207 items for a total of 8,043,383,903
bytes with a completion state of SUCCESS at 17:04:13. (SESSION: 531830,
PROCESS: 199)
Sep 26, 2018, 5:04:13 PM ANR4013I Audit container process 198 completed audit
of 1 containers; 1 successfully audited containers, 0 failed audited
containers. (SESSION: 531830, PROCESS: 199)
Sep 26, 2018, 5:04:13 PM ANR0987I Process 198 for AUDIT CONTAINER running in
the BACKGROUND processed 1 items with a completion state of SUCCESS at
17:04:13. (SESSION: 531830, PROCESS: 199)
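The audit itself is just the basic scan, something like:

/* Scan the flagged container and validate its extents. */
audit container /bucky1dc011/5a/0000000000005a8a.dcf action=scanall wait=yes

And right after it completes, the container still shows as unavailable: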
tsm: BUCKY1>q container /bucky1dc011/5a/0000000000005a8a.dcf f=d
Container: /bucky1dc011/5a/0000000000005a8a.dcf
Storage Pool Name: DCPOOL
Container Type: Dedup
State: Unavailable
Free Space(MB): 1,879
Maximum Size(MB): 10,104
Approx. Date Last Written: 09/26/2018 16:58:49
Approx. Date Last Audit: 09/26/2018 17:04:13
Cloud Type:
Cloud URL:
Cloud Object Size (MB):
Space Utilized (MB):
Data Extent Count:
It doesn't mark anything as damaged, but as soon as something touches the
container it goes back to unavailable; in this case an automatic container
move process hit it.
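The manual move is just the plain command (STGPOOLDIRECTORY to route data to a
specific directory is optional; I leave it off and let the server pick):

/* Try to move the extents off the flagged container. */
move container /bucky1dc011/5a/0000000000005a8a.dcf wait=yes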
It's not successful either:
Sep 26, 2018, 5:23:10 PM ANR0984I Process 215 for Move Container started in the
BACKGROUND at 17:23:09. (SESSION: 531830, PROCESS: 215)
Sep 26, 2018, 5:23:10 PM ANR2088E An I/O error occurred while reading container
/bucky1dc011/5a/0000000000005a8a.dcf in storage pool DCPOOL. (SESSION: 531830,
PROCESS: 215)
Sep 26, 2018, 5:23:10 PM ANR0985I Process 215 for Move Container running in the
BACKGROUND completed with completion state FAILURE at 17:23:10. (SESSION:
531830, PROCESS: 215)
Sep 26, 2018, 5:23:10 PM ANR1893E Process 215 for Move Container completed with
a completion state of FAILURE. (SESSION: 531830, PROCESS: 215)
After the failed move, the container is left in a read-only state:
tsm: BUCKY1>q container /bucky1dc011/5a/0000000000005a8a.dcf f=d
Container: /bucky1dc011/5a/0000000000005a8a.dcf
Storage Pool Name: DCPOOL
Container Type: Dedup
State: Read-Only
Free Space(MB): 1,881
Maximum Size(MB): 10,104
Approx. Date Last Written: 09/26/2018 16:58:49
Approx. Date Last Audit: 09/26/2018 17:04:13
Cloud Type:
Cloud URL:
Cloud Object Size (MB):
Space Utilized (MB):
Data Extent Count:
If I don't do the move and just leave the container as unavailable, then
PROTECT STGPOOL reports warnings.
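That's the standard pool protection run (MAXPROCESS here is just a
placeholder):

/* Protect the directory-container pool to its replica target. */
protect stgpool DCPOOL maxprocess=4 wait=yes

and every run warns about the unavailable container.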
Maybe someone else has encountered and fixed this problem. If so, I'd love to
know what you did.
Thanks!
-Kevin
-----Original Message-----
From: ADSM: Dist Stor Manager <[email protected]> On Behalf Of Alex Jaimes
Sent: Wednesday, September 26, 2018 13:26
To: [email protected]
Subject: Re: [ADSM-L] CONTAINER pool experiences
I echo Stefan, Rick and Luc... 110%
We've been using directory-container pools for about 2 years and they work great!
And yes, plan accordingly and monitor the TSM DB size as you migrate backups to
the container pools.
--Alex
On Wed, Sep 26, 2018 at 7:31 AM Michaud, Luc [Analyste principal -
environnement AIX] <[email protected]> wrote:
> Container pools saved the day here too!
>
> On our legacy environment (TSM717), adding dedup to our seqpools just
> bloated everything, until it became unbearable.
>
> Migrating nodes to the new blueprint replicated servers w/
> directory-container-pools solved a lot of our issues, especially with
> copy-to-tape, as rehydration is no longer required.
>
> We do have some apprehensions about the limitations around eventually
> migrating from copy-to-tape to copy-to-cloud, but we may cheat our way
> across with VTL-type gateways if need be.
>
> Luc
>
>