steveloughran commented on pull request #3240:
URL: https://github.com/apache/hadoop/pull/3240#issuecomment-888247513
Latest release
* Address review comments
* log IOStats after each test case.
Important: as the cached FS retains statistics, the numbers
get bigger over time.
* HDFS test is now reinstated, as we've identified that most
of its long execution time is from the large file upload/download
suites. Disable them and its execution time drops from 4m to 30s,
which means it can then be used to make sure the contract suite
is consistent between HDFS and the object stores.
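Because the cached FS keeps accumulating counters, the per-test-case numbers have to be derived by subtracting the previous snapshot from the current one. A minimal sketch in plain Java, assuming the counters have already been copied into a `Map<String, Long>`; `CounterDelta` and `diff` are hypothetical names for illustration, not part of the Hadoop IOStatistics API:

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper for illustration only; not a Hadoop class. */
final class CounterDelta {
  private CounterDelta() {
  }

  /**
   * Return (current - previous) for every counter in the current snapshot.
   * Counters missing from the previous snapshot are treated as 0;
   * counters whose value did not change are omitted from the result.
   */
  static Map<String, Long> diff(Map<String, Long> previous,
      Map<String, Long> current) {
    Map<String, Long> delta = new TreeMap<>();
    for (Map.Entry<String, Long> e : current.entrySet()) {
      long before = previous.getOrDefault(e.getKey(), 0L);
      long change = e.getValue() - before;
      if (change != 0) {
        delta.put(e.getKey(), change);
      }
    }
    return delta;
  }
}
```

Taking a snapshot before and after each test case and logging only the diff would keep the output tied to that one test rather than to everything run so far.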
IOStats of full suite against S3 London (1:43)
```
2021-07-28 12:40:48,632 [setup] INFO statistics.IOStatisticsLogging
(IOStatisticsLogging.java:logIOStatisticsAtLevel(269)) - IOStatistics:
counters=((action_executor_acquired=47)
(action_http_get_request=38)
(action_http_head_request=111)
(audit_request_execution=420)
(audit_span_creation=483)
(directories_created=38)
(directories_deleted=1)
(fake_directories_deleted=485)
(files_copied=2)
(files_copied_bytes=264)
(files_created=47)
(files_deleted=48)
(ignored_errors=14)
(object_bulk_delete_request=88)
(object_copy_requests=2)
(object_delete_objects=534)
(object_delete_request=5)
(object_list_request=89)
(object_metadata_request=111)
(object_put_bytes=18880752)
(object_put_request=85)
(object_put_request_completed=85)
(op_create=47)
(op_delete=14)
(op_exists=13)
(op_exists.failures=3)
(op_get_file_status=194)
(op_get_file_status.failures=44)
(op_glob_status=25)
(op_is_file=1)
(op_list_files=9)
(op_list_status=60)
(op_mkdirs=64)
(op_open=39)
(op_rename=2)
(s3guard_metadatastore_initialization=1)
(s3guard_metadatastore_put_path_request=103)
(s3guard_metadatastore_record_deletes=2)
(s3guard_metadatastore_record_reads=1473)
(s3guard_metadatastore_record_writes=350)
(store_io_request=422)
(stream_read_bytes=18878052)
(stream_read_close_operations=39)
(stream_read_closed=38)
(stream_read_opened=38)
(stream_read_operations=2742)
(stream_read_operations_incomplete=1639)
(stream_read_seek_policy_changed=39)
(stream_read_total_bytes=18878052)
(stream_write_block_uploads=47)
(stream_write_bytes=18880752)
(stream_write_total_data=37761504));
gauges=((stream_write_block_uploads_pending=47));
minimums=((action_executor_acquired.min=0)
(action_http_get_request.min=31)
(action_http_head_request.min=22)
(object_bulk_delete_request.min=45)
(object_delete_request.min=34)
(object_list_request.min=28)
(object_put_request.min=42)
(op_create.min=16)
(op_delete.min=53)
(op_exists.failures.min=16)
(op_exists.min=15)
(op_get_file_status.failures.min=16)
(op_get_file_status.min=15)
(op_glob_status.min=15)
(op_is_file.min=43)
(op_list_files.min=176)
(op_list_status.min=64)
(op_mkdirs.min=16)
(op_rename.min=967));
maximums=((action_executor_acquired.max=0)
(action_http_get_request.max=123)
(action_http_head_request.max=317)
(object_bulk_delete_request.max=384)
(object_delete_request.max=91)
(object_list_request.max=202)
(object_put_request.max=2083)
(op_create.max=129)
(op_delete.max=2196)
(op_exists.failures.max=45)
(op_exists.max=43)
(op_get_file_status.failures.max=29)
(op_get_file_status.max=341)
(op_glob_status.max=192)
(op_is_file.max=43)
(op_list_files.max=589)
(op_list_status.max=260)
(op_mkdirs.max=729)
(op_rename.max=1199));
means=((action_executor_acquired.mean=(samples=47, sum=0, mean=0.0000))
(action_http_get_request.mean=(samples=38, sum=1490, mean=39.2105))
(action_http_head_request.mean=(samples=111, sum=4311, mean=38.8378))
(object_bulk_delete_request.mean=(samples=88, sum=12810, mean=145.5682))
(object_delete_request.mean=(samples=5, sum=260, mean=52.0000))
(object_list_request.mean=(samples=89, sum=4988, mean=56.0449))
(object_put_request.mean=(samples=85, sum=17463, mean=205.4471))
(op_create.mean=(samples=47, sum=1160, mean=24.6809))
(op_delete.mean=(samples=14, sum=11257, mean=804.0714))
(op_exists.failures.mean=(samples=3, sum=80, mean=26.6667))
(op_exists.mean=(samples=10, sum=250, mean=25.0000))
(op_get_file_status.failures.mean=(samples=44, sum=876, mean=19.9091))
(op_get_file_status.mean=(samples=150, sum=6404, mean=42.6933))
(op_glob_status.mean=(samples=25, sum=1826, mean=73.0400))
(op_is_file.mean=(samples=1, sum=43, mean=43.0000))
(op_list_files.mean=(samples=9, sum=3218, mean=357.5556))
(op_list_status.mean=(samples=60, sum=7084, mean=118.0667))
(op_mkdirs.mean=(samples=64, sum=15375, mean=240.2344))
(op_rename.mean=(samples=2, sum=2166, mean=1083.0000)));
```
IOStats of full suite against Azure Cardiff (1:28). That region is about 30
miles away from here, though I don't know how cables are routed across the
Bristol Channel, so the actual path will probably be a bit longer. In contrast,
London is 100-120 miles away, so latency is always going to be a bit higher there.
```
2021-07-28 12:43:57,686 INFO [setup]: statistics.IOStatisticsLogging
(IOStatisticsLogging.java:logIOStatisticsAtLevel(269)) - IOStatistics:
counters=((action_http_delete_request=48)
(action_http_delete_request.failures=34)
(action_http_get_request=161)
(action_http_head_request=333)
(action_http_head_request.failures=79)
(action_http_put_request=237)
(bytes_received=18878316)
(bytes_sent=18881016)
(connections_made=779)
(directories_created=71)
(files_created=49)
(get_responses=779)
(op_create=49)
(op_delete=48)
(op_exists=53)
(op_get_file_status=291)
(op_list_status=107)
(op_mkdirs=71)
(op_open=41)
(op_rename=22)
(send_requests=237));
gauges=();
minimums=((action_http_delete_request.failures.min=21)
(action_http_delete_request.min=31)
(action_http_get_request.min=21)
(action_http_head_request.failures.min=20)
(action_http_head_request.min=19)
(action_http_put_request.min=23));
maximums=((action_http_delete_request.failures.max=332)
(action_http_delete_request.max=146)
(action_http_get_request.max=2193)
(action_http_head_request.failures.max=262)
(action_http_head_request.max=822)
(action_http_put_request.max=3370));
means=((action_http_delete_request.failures.mean=(samples=34, sum=1901,
mean=55.9118))
(action_http_delete_request.mean=(samples=14, sum=744, mean=53.1429))
(action_http_get_request.mean=(samples=161, sum=15025, mean=93.3230))
(action_http_head_request.failures.mean=(samples=79, sum=3668, mean=46.4304))
(action_http_head_request.mean=(samples=254, sum=9391, mean=36.9724))
(action_http_put_request.mean=(samples=237, sum=27099, mean=114.3418)));
```
ABFS is collecting far fewer stats. We really need
* duration of all FS API calls
* LIST performance numbers split from GET calls, which they
currently aren't.
Really interesting there that HEAD -> 404 has a mean time of 46ms, against
about 37ms for HEAD -> 200.
There are always going to be some probes before creating files and dirs, so the
cost of those negative probes is going to be visible for those operations.
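The HEAD comparison can be rechecked from the `samples` and `sum` fields of the logged means. A rough sketch, assuming the `means=(...)` text has been captured as a string and that the durations are in milliseconds; `MeanStats` and `parseMeans` are hypothetical names, and the regex is only written against the entry layout visible in the logs above:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration only; not a Hadoop class. */
final class MeanStats {
  // Matches entries such as
  // (op_rename.mean=(samples=2, sum=2166, mean=1083.0000))
  // and tolerates a line wrap after either comma.
  private static final Pattern ENTRY = Pattern.compile(
      "\\(([\\w.]+)\\.mean=\\(samples=(\\d+),\\s*sum=(\\d+),\\s*mean=[\\d.]+\\)\\)");

  private MeanStats() {
  }

  /** Map each statistic name to sum/samples, recomputed from the log text. */
  static Map<String, Double> parseMeans(String log) {
    Map<String, Double> means = new LinkedHashMap<>();
    Matcher m = ENTRY.matcher(log);
    while (m.find()) {
      long samples = Long.parseLong(m.group(2));
      long sum = Long.parseLong(m.group(3));
      means.put(m.group(1), samples == 0 ? 0.0 : (double) sum / samples);
    }
    return means;
  }
}
```

Applied to the second log, this gives about 46.4 for `action_http_head_request.failures` (3668/79) and about 37.0 for `action_http_head_request` (9391/254), a gap of roughly 9.5ms per negative probe.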