steveloughran commented on pull request #2530: URL: https://github.com/apache/hadoop/pull/2530#issuecomment-758146786
Next PR to come in

* tries to address all review comments
* adds stats gathering
* adds almost all the AWS headers (everything but some of the encryption stuff) as XAttrs to be listed.

Log of a test run on a newly created file on a bucket with SSE-S3 and versioning.

```
2021-01-11 18:32:24,009 [JUnit-testXAttrFileCost] INFO impl.ITestXAttrCost (ITestXAttrCost.java:lambda$logXAttrs$2(78)) - header.Cache-Control has bytes[0] => ""
2021-01-11 18:32:24,009 [JUnit-testXAttrFileCost] INFO impl.ITestXAttrCost (ITestXAttrCost.java:lambda$logXAttrs$2(78)) - header.Content-Disposition has bytes[0] => ""
2021-01-11 18:32:24,009 [JUnit-testXAttrFileCost] INFO impl.ITestXAttrCost (ITestXAttrCost.java:lambda$logXAttrs$2(78)) - header.Content-Encoding has bytes[0] => ""
2021-01-11 18:32:24,009 [JUnit-testXAttrFileCost] INFO impl.ITestXAttrCost (ITestXAttrCost.java:lambda$logXAttrs$2(78)) - header.Content-Language has bytes[0] => ""
2021-01-11 18:32:24,010 [JUnit-testXAttrFileCost] INFO impl.ITestXAttrCost (ITestXAttrCost.java:lambda$logXAttrs$2(78)) - header.Content-Length has bytes[1] => "0"
2021-01-11 18:32:24,010 [JUnit-testXAttrFileCost] INFO impl.ITestXAttrCost (ITestXAttrCost.java:lambda$logXAttrs$2(78)) - header.Content-MD5 has bytes[0] => ""
2021-01-11 18:32:24,010 [JUnit-testXAttrFileCost] INFO impl.ITestXAttrCost (ITestXAttrCost.java:lambda$logXAttrs$2(78)) - header.Content-Range has bytes[0] => ""
2021-01-11 18:32:24,011 [JUnit-testXAttrFileCost] INFO impl.ITestXAttrCost (ITestXAttrCost.java:lambda$logXAttrs$2(78)) - header.Content-Type has bytes[24] => "application/octet-stream"
2021-01-11 18:32:24,011 [JUnit-testXAttrFileCost] INFO impl.ITestXAttrCost (ITestXAttrCost.java:lambda$logXAttrs$2(78)) - header.ETag has bytes[32] => "d41d8cd98f00b204e9800998ecf8427e"
2021-01-11 18:32:24,011 [JUnit-testXAttrFileCost] INFO impl.ITestXAttrCost (ITestXAttrCost.java:lambda$logXAttrs$2(78)) - header.Last-Modified has bytes[28] => "Mon Jan 11 18:32:24 GMT 2021"
2021-01-11 18:32:24,012 [JUnit-testXAttrFileCost] INFO impl.ITestXAttrCost (ITestXAttrCost.java:lambda$logXAttrs$2(78)) - header.x-amz-archive-status has bytes[0] => ""
2021-01-11 18:32:24,012 [JUnit-testXAttrFileCost] INFO impl.ITestXAttrCost (ITestXAttrCost.java:lambda$logXAttrs$2(78)) - header.x-amz-object-lock-legal-hold has bytes[0] => ""
2021-01-11 18:32:24,013 [JUnit-testXAttrFileCost] INFO impl.ITestXAttrCost (ITestXAttrCost.java:lambda$logXAttrs$2(78)) - header.x-amz-object-lock-mode has bytes[0] => ""
2021-01-11 18:32:24,014 [JUnit-testXAttrFileCost] INFO impl.ITestXAttrCost (ITestXAttrCost.java:lambda$logXAttrs$2(78)) - header.x-amz-object-lock-retain-until-date has bytes[0] => ""
2021-01-11 18:32:24,014 [JUnit-testXAttrFileCost] INFO impl.ITestXAttrCost (ITestXAttrCost.java:lambda$logXAttrs$2(78)) - header.x-amz-replication-status has bytes[0] => ""
2021-01-11 18:32:24,014 [JUnit-testXAttrFileCost] INFO impl.ITestXAttrCost (ITestXAttrCost.java:lambda$logXAttrs$2(78)) - header.x-amz-server-side-encryption has bytes[6] => "AES256"
2021-01-11 18:32:24,015 [JUnit-testXAttrFileCost] INFO impl.ITestXAttrCost (ITestXAttrCost.java:lambda$logXAttrs$2(78)) - header.x-amz-storage-class has bytes[0] => ""
2021-01-11 18:32:24,015 [JUnit-testXAttrFileCost] INFO impl.ITestXAttrCost (ITestXAttrCost.java:lambda$logXAttrs$2(78)) - header.x-amz-version-id has bytes[32] => "XeajHuYbsD1rO.Bh.6UKqnqVMCZvkWg1"
```

And for the curious, that of /

```
2021-01-11 18:32:21,207 [JUnit-testXAttrRoot] INFO impl.ITestXAttrCost (ITestXAttrCost.java:lambda$logXAttrs$2(78)) - header.Cache-Control has bytes[0] => ""
2021-01-11 18:32:21,208 [JUnit-testXAttrRoot] INFO impl.ITestXAttrCost (ITestXAttrCost.java:lambda$logXAttrs$2(78)) - header.Content-Disposition has bytes[0] => ""
2021-01-11 18:32:21,208 [JUnit-testXAttrRoot] INFO impl.ITestXAttrCost (ITestXAttrCost.java:lambda$logXAttrs$2(78)) - header.Content-Encoding has bytes[0] => ""
2021-01-11 18:32:21,208 [JUnit-testXAttrRoot] INFO impl.ITestXAttrCost (ITestXAttrCost.java:lambda$logXAttrs$2(78)) - header.Content-Language has bytes[0] => ""
2021-01-11 18:32:21,208 [JUnit-testXAttrRoot] INFO impl.ITestXAttrCost (ITestXAttrCost.java:lambda$logXAttrs$2(78)) - header.Content-Length has bytes[1] => "0"
2021-01-11 18:32:21,209 [JUnit-testXAttrRoot] INFO impl.ITestXAttrCost (ITestXAttrCost.java:lambda$logXAttrs$2(78)) - header.Content-MD5 has bytes[0] => ""
2021-01-11 18:32:21,210 [JUnit-testXAttrRoot] INFO impl.ITestXAttrCost (ITestXAttrCost.java:lambda$logXAttrs$2(78)) - header.Content-Range has bytes[0] => ""
2021-01-11 18:32:21,210 [JUnit-testXAttrRoot] INFO impl.ITestXAttrCost (ITestXAttrCost.java:lambda$logXAttrs$2(78)) - header.Content-Type has bytes[15] => "application/xml"
2021-01-11 18:32:21,213 [JUnit-testXAttrRoot] INFO impl.ITestXAttrCost (ITestXAttrCost.java:lambda$logXAttrs$2(78)) - header.ETag has bytes[0] => ""
2021-01-11 18:32:21,213 [JUnit-testXAttrRoot] INFO impl.ITestXAttrCost (ITestXAttrCost.java:lambda$logXAttrs$2(78)) - header.Last-Modified has bytes[0] => ""
2021-01-11 18:32:21,213 [JUnit-testXAttrRoot] INFO impl.ITestXAttrCost (ITestXAttrCost.java:lambda$logXAttrs$2(78)) - header.x-amz-archive-status has bytes[0] => ""
2021-01-11 18:32:21,213 [JUnit-testXAttrRoot] INFO impl.ITestXAttrCost (ITestXAttrCost.java:lambda$logXAttrs$2(78)) - header.x-amz-object-lock-legal-hold has bytes[0] => ""
2021-01-11 18:32:21,213 [JUnit-testXAttrRoot] INFO impl.ITestXAttrCost (ITestXAttrCost.java:lambda$logXAttrs$2(78)) - header.x-amz-object-lock-mode has bytes[0] => ""
2021-01-11 18:32:21,214 [JUnit-testXAttrRoot] INFO impl.ITestXAttrCost (ITestXAttrCost.java:lambda$logXAttrs$2(78)) - header.x-amz-object-lock-retain-until-date has bytes[0] => ""
2021-01-11 18:32:21,214 [JUnit-testXAttrRoot] INFO impl.ITestXAttrCost (ITestXAttrCost.java:lambda$logXAttrs$2(78)) - header.x-amz-replication-status has bytes[0] => ""
2021-01-11 18:32:21,214 [JUnit-testXAttrRoot] INFO impl.ITestXAttrCost (ITestXAttrCost.java:lambda$logXAttrs$2(78)) - header.x-amz-server-side-encryption has bytes[0] => ""
2021-01-11 18:32:21,215 [JUnit-testXAttrRoot] INFO impl.ITestXAttrCost (ITestXAttrCost.java:lambda$logXAttrs$2(78)) - header.x-amz-storage-class has bytes[0] => ""
2021-01-11 18:32:21,215 [JUnit-testXAttrRoot] INFO impl.ITestXAttrCost (ITestXAttrCost.java:lambda$logXAttrs$2(78)) - header.x-amz-version-id has bytes[0] => ""
```

And the stats of that test run

```
2021-01-11 18:32:26,484 [JUnit] INFO s3a.AbstractS3ATestBase (AbstractS3ATestBase.java:dumpFileSystemIOStatistics(99)) - Aggregate FileSystem Statistics counters=((action_executor_acquired=1) (action_http_head_request=11) (directories_created=2) (directories_deleted=1) (fake_directories_deleted=1) (files_created=1) (files_deleted=1) (object_bulk_delete_request=2) (object_delete_objects=3) (object_delete_request=1) (object_list_request=8) (object_metadata_request=11) (object_put_request=3) (object_put_request_completed=3) (op_create=1) (op_delete=2) (op_exists=2) (op_get_file_status=2) (op_list_files=1) (op_mkdirs=2) (op_xattr_get_map=2) (op_xattr_get_named=1) (op_xattr_list=2) (stream_write_block_uploads=1));
gauges=();
minimums=((action_executor_acquired.min=0) (action_http_head_request.min=16) (object_bulk_delete_request.min=46) (object_delete_request.min=37) (object_list_request.min=33) (op_xattr_get_map.min=17) (op_xattr_get_named.min=36) (op_xattr_list.min=18));
maximums=((action_executor_acquired.max=0) (action_http_head_request.max=903) (object_bulk_delete_request.max=97) (object_delete_request.max=37) (object_list_request.max=934) (op_xattr_get_map.max=33) (op_xattr_get_named.max=36) (op_xattr_list.max=22));
means=((action_executor_acquired.mean=(samples=1, sum=0, mean=0.0000)) (action_http_head_request.mean=(samples=11, sum=1279, mean=116.2727)) (object_bulk_delete_request.mean=(samples=2, sum=143, mean=71.5000)) (object_delete_request.mean=(samples=1, sum=37, mean=37.0000))
(object_list_request.mean=(samples=8, sum=5094, mean=636.7500)) (op_xattr_get_map.mean=(samples=2, sum=50, mean=25.0000)) (op_xattr_get_named.mean=(samples=1, sum=36, mean=36.0000)) (op_xattr_list.mean=(samples=2, sum=40, mean=20.0000)));
```

@sunchao XAttr calls seem to take ~25 milliseconds each over a long-haul link. If someone called `getXAttr(name)` once per attribute to list them all, it would be expensive. Calling `getXAttrs` to fetch multiple attributes in one call performs better.
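To put rough numbers on that remark, here is a back-of-envelope sketch using the mean durations from the statistics dump. It assumes (this is not stated in the comment) that a client walking attributes one by one makes one `listXAttrs`-style call followed by one named lookup per attribute, and that each call costs about its observed mean. The class and method names are hypothetical, and the sample counts are tiny (one or two calls each), so treat the result as an order-of-magnitude estimate only.

```java
/** Rough cost comparison built from the means in the statistics dump. */
class XAttrCostEstimate {
    // Mean durations in milliseconds, copied from the "means=" section.
    static final double LIST_MEAN_MS = 20.0;      // op_xattr_list (2 samples)
    static final double GET_NAMED_MEAN_MS = 36.0; // op_xattr_get_named (1 sample)
    static final double GET_MAP_MEAN_MS = 25.0;   // op_xattr_get_map (2 samples)

    /** Assumed pattern: one list call, then one named get per attribute. */
    static double oneByOneMs(int attributeCount) {
        return LIST_MEAN_MS + attributeCount * GET_NAMED_MEAN_MS;
    }

    /** Assumed pattern: a single call returning the whole attribute map. */
    static double bulkMs() {
        return GET_MAP_MEAN_MS;
    }

    public static void main(String[] args) {
        int headers = 18; // number of header.* attributes in each log listing
        System.out.printf("one by one: %.0f ms, bulk: %.0f ms%n",
                oneByOneMs(headers), bulkMs());
    }
}
```

For the 18 headers logged in the test run, this works out to roughly 668 ms for the one-by-one pattern against 25 ms for a single map call.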
