Shaun,
 
Have you had a look at the following section of the IBM Knowledge Center?
 
https://www.ibm.com/support/knowledgecenter/en/STXKQY_5.0.0/com.ibm.spectrum.scale.v5r00.doc/bl1adm_compression.htm#compression__sec_updates
 
I think the answer to your question is going to depend on a number of factors:
 
1) The number of blocks the file occupies: if it's fewer than 10, the whole file is contained in a single compression group; if it's more than 10, the file is spread across multiple compression groups; and if the file is smaller than 2 blocks, it won't be compressed at all.
 
See "Limitations":
File compression processes each compression group within a file independently. A compression group consists of one to 10 consecutive data blocks within a file. If the file contains fewer than 10 data blocks, the whole file is one compression group. If the saving of space for a compression group is less than 10%, file compression does not compress it but skips to the next compression group.
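
As a quick way to see how many blocks a given file occupies and whether it is currently compressed, you can use stat and mmlsattr; the path here is purely illustrative:

    stat /gpfs/fs1/bigfile          # "Blocks:" shows allocated 512-byte units
    mmlsattr -L /gpfs/fs1/bigfile   # the flags line shows whether the file is marked illcompressed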
 
2) In terms of deletion, that is effectively a write update, in which case the following would apply:
 
When a compressed file is updated by a write operation, the file system automatically decompresses the region of the file that contains the affected data and sets the illCompressed flag. The file system then makes the update. To recompress the file, run the mmrestripefile command with the -z option, as in the following example: 
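
The example that follows in the documentation amounts to something like this (the file name is illustrative, not the one from the docs):

    mmrestripefile -z /gpfs/fs1/bigfile

mmrestripefs also takes -z if you want to recompress an entire file system in one pass.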
 
If the file is spread over multiple compression groups, then the illCompressed flag applies to the groups that have been updated, and the next time the compression policy runs they will be recompressed.
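
For reference, a recompression policy rule might look something like the sketch below; the rule name, file system name, policy file name, and seven-day threshold are my own assumptions, not from the documentation:

    RULE 'recompress' MIGRATE COMPRESS('lz4')
        WHERE (CURRENT_TIMESTAMP - MODIFICATION_TIME) > INTERVAL '7' DAYS

    # run it against file system fs1
    mmapplypolicy fs1 -P compress.pol -I yes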
 
In terms of performance overhead, the answer will always be: it depends on your specific data and environment.
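
That said, it's easy to get a rough feel for the space savings on your own data by compressing a representative file in place and comparing its allocation before and after (paths illustrative):

    du -h /gpfs/fs1/sample.dat                        # allocated size before
    mmchattr --compression lz4 /gpfs/fs1/sample.dat   # compress in place
    du -h /gpfs/fs1/sample.dat                        # allocated size after

The CPU cost is the part you'd really need to benchmark with your own workload.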
 
Regards,
Andrew Beattie
Software Defined Storage - IT Specialist
Phone: 614-2133-7927
 
 
----- Original message -----
From: Shaun Anderson <[email protected]>
Sent by: [email protected]
To: "[email protected]" <[email protected]>
Cc:
Subject: [gpfsug-discuss] Compression details
Date: Thu, Jul 26, 2018 5:16 AM
 

I've had the question come up about how SS will handle file deletion, as well as the overhead required for compression using lz4.

The two questions I'm looking for answers to (or, better yet, reference material documenting them) are:

1) How is file deletion handled?

Is the block containing the compressed file decompressed, the file deleted, and then recompressed? Or is the metadata simply updated to show that the file is deleted? Does Scale run an implicit 'mmchattr --compression no' command?

2) Are there any guidelines on the overhead to plan for in a compressed (lz4) environment? I'm not seeing any kind of sizing guidance. This is potentially going to be for an existing ESS GL2 system.

Any assistance or direction is appreciated.

Regards,

SHAUN ANDERSON
STORAGE ARCHITECT
O 208.577.2112
M 214.263.7014
 
NOTICE: This email message and any attachments hereto may contain confidential
information. Any unauthorized review, use, disclosure, or distribution of such
information is prohibited. If you are not the intended recipient, please contact
the sender by reply email and destroy the original message and all copies of it.
 

_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
