ayush1300 commented on code in PR #7834:
URL: https://github.com/apache/hadoop/pull/7834#discussion_r2247805315


##########
hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/s3aTagging.md:
##########
@@ -0,0 +1,298 @@
+# S3 Object Tagging Support in Hadoop S3A Filesystem
+
+## Overview
+
+The Hadoop S3A filesystem connector now supports S3 object tagging, allowing users to automatically assign metadata tags to S3 objects during creation and soft deletion operations. This feature enables better data organization, cost allocation, access control, and lifecycle management for S3-stored data.
+
+**JIRA Issue**: [HADOOP-19536](https://issues.apache.org/jira/browse/HADOOP-19536#s3-tags)
+
+## Table of Contents
+
+- [Motivation](#motivation)
+- [S3 Object Tagging Capabilities](#s3-object-tagging-capabilities)
+- [Use Cases](#use-cases)
+- [Configuration](#configuration)
+- [Usage Examples](#usage-examples)
+- [Soft Delete Feature](#soft-delete-feature)
+- [Best Practices](#best-practices)
+- [Limitations](#limitations)
+
+## Motivation
+
+Amazon S3 supports tagging objects with key-value pairs, providing several critical benefits:
+
+1. **Cost Allocation**: Track and allocate S3 storage costs across departments, projects, or cost centers
+2. **Access Control**: Use tags in IAM policies to control object access permissions
+3. **Lifecycle Management**: Trigger automated lifecycle policies for object transitions and expiration
+4. **Data Classification**: Organize and classify data for compliance, security, and business requirements
+5. **Analytics and Reporting**: Enable detailed analytics and reporting based on object metadata
+
+Previously, the Hadoop S3A connector lacked native support for object tagging, requiring users to implement custom solutions or use separate tools to tag objects post-creation.
+
+## S3 Object Tagging Capabilities
+
+### Tag Specifications
+- **Maximum Tags**: Up to 10 tags per object
+- **Structure**: Key-value pairs
+- **Key Length**: Up to 128 Unicode characters
+- **Value Length**: Up to 256 Unicode characters
+- **Case Sensitivity**: Keys and values are case-sensitive
+- **Uniqueness**: Tag keys must be unique per object (no duplicate keys)
+
+### Allowed Characters
+Tag keys and values can contain:
+- Letters (a-z, A-Z)
+- Numbers (0-9)
+- Spaces
+- Special symbols: `. : + - = _ / @`
+
+## Use Cases
+
+### 1. Access Control with IAM Policies
+
+Control object access based on tags:
+
+```json
+{
+    "Effect": "Allow",
+    "Action": "s3:GetObject",
+    "Resource": "*",
+    "Condition": {
+        "StringEquals": {
+            "s3:ExistingObjectTag/department": "finance"
+        }
+    }
+}
+```
+
+### 2. Lifecycle Management
+
+Trigger lifecycle rules based on tags:
+
+```json
+{
+    "Rules": [
+        {
+            "Status": "Enabled",
+            "Filter": {
+                "Tag": {
+                    "Key": "retention",
+                    "Value": "temporary"
+                }
+            },
+            "Expiration": {
+                "Days": 30
+            }
+        }
+    ]
+}
+```
+
+### 3. Cost Allocation and Tracking
+
+- Use tags for cost tracking in AWS Cost Explorer
+- Allocate costs across different business units or projects
+- Generate detailed billing reports by tag dimensions
+
+### 4. Data Analytics and Filtering
+
+- Use S3 Analytics to filter and analyze data by tags
+- Create custom reports based on tagged object metadata
+- Enable data governance and compliance reporting
+
+## Configuration
+
+### Object Creation Tags
+
+#### Method 1: Comma-Separated List
+```properties
+fs.s3a.object.tags=department=finance,project=alpha,owner=data-team
+```
+
+#### Method 2: Individual Tag Properties
+```properties
+fs.s3a.object.tag.department=finance
+fs.s3a.object.tag.project=alpha
+fs.s3a.object.tag.owner=data-team
+fs.s3a.object.tag.environment=production
+```
+
+### Soft Delete Tags
+```properties
+fs.s3a.soft.delete.enabled=true

Review Comment:
   1. Yes, the object will be tagged with either the tag supplied by the user or a default deletion tag.
   2. It is for recovery. Users can archive S3 objects on the basis of their tags and recover them later when needed.
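
   To make the two answers above concrete, here is a minimal, self-contained sketch of the idea. It is not the code in this PR: the class `ObjectTagSketch`, its methods, and the `soft-deleted=true` marker are hypothetical names chosen for illustration. It only assumes the comma-separated `key=value` format shown for `fs.s3a.object.tags` and the tag limits listed in the quoted document.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper for illustration; not part of the PR or of hadoop-aws. */
final class ObjectTagSketch {
    // Limits taken from the "Tag Specifications" list in the quoted document.
    static final int MAX_TAGS = 10;
    static final int MAX_KEY_LENGTH = 128;
    static final int MAX_VALUE_LENGTH = 256;

    private ObjectTagSketch() {
    }

    /** Parses "k1=v1,k2=v2" into an ordered map and enforces the limits above. */
    static Map<String, String> parseTags(String spec) {
        Map<String, String> tags = new LinkedHashMap<>();
        if (spec == null || spec.trim().isEmpty()) {
            return tags;
        }
        for (String pair : spec.split(",")) {
            // Split on the first '=' only, so a value may itself contain '='.
            int eq = pair.indexOf('=');
            if (eq <= 0) {
                throw new IllegalArgumentException("Expected key=value but got: " + pair);
            }
            String key = pair.substring(0, eq).trim();
            String value = pair.substring(eq + 1).trim();
            // Count Unicode code points rather than UTF-16 units.
            if (key.isEmpty()
                    || key.codePointCount(0, key.length()) > MAX_KEY_LENGTH
                    || value.codePointCount(0, value.length()) > MAX_VALUE_LENGTH) {
                throw new IllegalArgumentException("Tag exceeds length limits: " + pair);
            }
            if (tags.put(key, value) != null) {
                throw new IllegalArgumentException("Duplicate tag key: " + key);
            }
        }
        if (tags.size() > MAX_TAGS) {
            throw new IllegalArgumentException("More than " + MAX_TAGS + " tags");
        }
        return tags;
    }

    /** Returns a copy of the given tags with a soft-delete marker added. */
    static Map<String, String> withSoftDeleteTag(
            Map<String, String> existing, String key, String value) {
        Map<String, String> merged = new LinkedHashMap<>(existing);
        merged.put(key, value);
        if (merged.size() > MAX_TAGS) {
            throw new IllegalStateException("Soft-delete tag would exceed " + MAX_TAGS + " tags");
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> created = parseTags("department=finance,project=alpha");
        // "soft-deleted=true" is an assumed marker, not a name defined by the PR.
        System.out.println(withSoftDeleteTag(created, "soft-deleted", "true"));
    }
}
```

   In this sketch a soft delete leaves the object in place and only adds a marker tag, so a lifecycle rule or a later recovery job could select objects by that tag. How the PR actually applies the tag is not shown in the quoted diff.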



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org