SSD/HDD and blob division

Viktor Fri, 04 May 2018 10:50:09 -0700

Hi,

I just had a look at perkeepd, as I found the philosophy interesting, and 
ran a short test. In short:


I downloaded the repo, checked out release/0.10, ran perkeepd and tried to 
put in some files. Unfortunately the performance was way slower than my 
network, so I tried locally on the same machine, first with the blob 
storage on an HDD and then on an SSD (index on the SSD in both cases).

I figured that this might be of interest to you. I will also try to find 
out what is going on, both to learn and to see if I can use it :-)

1. When pk-put'ing a file, it is initially broken into rather large 
portions/blobs, but towards the end it degenerates into very small blobs. 
I can only assume this creates performance problems, both in terms of disk 
seeks/writes and RAM usage.
2. The write performance is way lower than expected (CPU usage at ~20% 
throughout the pk-put process): about 2MB/s with blob storage on an HDD 
and about 10MB/s on an SSD, both significantly below the expected 
performance of the disks.

Is this expected or known?

Recreation/case:

Setup:
git clone https://camlistore.googlesource.com/camlistore perkeep.org
cd perkeep.org
git checkout release/0.10
go run make.go
perkeepd
(editing the config to put the index on the SSD path, and the blobs on the 
HDD and SSD paths respectively)

Case 1:
----------
dd if=/dev/urandom of=random.txt bs=1024 count=1048576
time pk-put file --permanode --title="test" --tag=backup random.txt

While watching the output from perkeepd, I see that the blob size seems to 
decrease over time. Compare e.g. the output for the early blobs and the 
late blobs:
Early: (assuming the first 2 are permanodes etc.)
2018/05/04 18:29:47 Received blob 
[sha224-4556e3a2241ea654a80d72d003382f976d864a54fc90b83fef40cf7c; 449 bytes]
2018/05/04 18:29:47 Received blob 
[sha224-3dcc6a1e24475efc417b4ab1b6cfe02ef2d9c085ead96b5a242ed58e; 628 bytes]
2018/05/04 18:29:50 Received blob 
[sha224-0db450c2fbc821a7f3547ca6ad5c8d5971802f2573c5b8d61bade2c8; 68904 
bytes]
2018/05/04 18:29:50 Received blob 
[sha224-425b332fa1ee724e191235a981341db7a25084790a868465e7ed0436; 77478 
bytes]
2018/05/04 18:29:50 Received blob 
[sha224-f5e6e2d09840ab93fc252ff172911f0013ae06c706f43e52b5e5a014; 66016 
bytes]
2018/05/04 18:29:50 Received blob 
[sha224-f3e373b4a528b8ec3734f0408f30d30aba8ca4d743f6b17ba52bda74; 262144 
bytes]
2018/05/04 18:29:50 Received blob 
[sha224-2573b030433ca209ddb5e2aea291946a6f67e5934b6a3a0e1471c8fc; 81458 
bytes]
2018/05/04 18:29:50 Received blob 
[sha224-451b14a507b3e15f44a589be39a34987cab8e403034822367e6a25ab; 68287 
bytes]
2018/05/04 18:29:50 Received blob 
[sha224-45a2a81086cc4497aa0fbded863da10a1a28a0ad6747f44c09c713d3; 69632 
bytes]
Late:
2018/05/04 18:31:18 Received blob 
[sha224-ebdf3a3e7d72f5eaab737a353ba59d97d82cb4fd108131b1b03befa1; 414 bytes]
2018/05/04 18:31:18 Received blob 
[sha224-363b5bcb8b145fc2f29faf0d6d8de7ddb27bf68107ea544290c178ba; 412 bytes]
2018/05/04 18:31:18 Received blob 
[sha224-7b67715d347b97b6768ffbe7f4c9fcfa0fd083f9055e783bd2527750; 646 bytes]
2018/05/04 18:31:18 Received blob 
[sha224-c96a6e7fd1597cee8f347ef0deee45e645d8665ef96ae6372a582dda; 1947 
bytes]
2018/05/04 18:31:18 Received blob 
[sha224-19bce313cda795b30a7161487cdb7be0108dd3b466e92911a5465fb5; 295 bytes]
2018/05/04 18:31:18 Received blob 
[sha224-cde65fa0e356be938d3e395f2f01bab940631df3609007b96094a8d1; 297 bytes]
2018/05/04 18:31:18 Received blob 
[sha224-e80de3ab6d991b62878e078a7d3140a5d7f4717c2fdaabf4757c91b1; 531 bytes]
2018/05/04 18:31:18 Received blob 
[sha224-fe46fa614191ec90c11a5ccf40282c1d210292d7eca297a0d6ec2609; 412 bytes]
2018/05/04 18:31:18 Received blob 
[sha224-d63b69b880326dbd0aa0e73e2f3cf4305bb144db66bb950476603fb1; 1235 
bytes]
2018/05/04 18:31:18 Received blob 
[sha224-7b51468d9cbcec724b70c1accf822cac7bf3c6aacd25876caf80439d; 295 bytes]

Maybe this is expected because some sorting is going on (I will dig into 
the code later), but I initially found it a bit strange and a potential 
explanation for 2).
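
To check whether the blob size really does shrink over the whole run rather 
than only in the two excerpts above, the perkeepd output could be summarized 
per second. The following is a minimal sketch, assuming every entry has the 
"Received blob [<ref>; <n> bytes]" shape shown above; the helper names 
parse_blobs and summarize are made up for illustration and are not part of 
Perkeep.

```python
import re
from statistics import median

# Matches the "Received blob [<ref>; <n> bytes]" entries shown above.
# \s+ also tolerates the hard line wraps added by the mail client.
BLOB_RE = re.compile(
    r"(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2})\s+Received blob\s+"
    r"\[(sha224-[0-9a-f]+);\s+(\d+)\s+bytes\]"
)


def parse_blobs(log_text):
    """Return (timestamp, blobref, size_in_bytes) tuples in log order."""
    return [(ts, ref, int(size)) for ts, ref, size in BLOB_RE.findall(log_text)]


def summarize(blobs):
    """Group blob sizes by timestamp (one-second resolution)."""
    by_second = {}
    for ts, _ref, size in blobs:
        by_second.setdefault(ts, []).append(size)
    return {
        ts: {"count": len(sizes), "total": sum(sizes), "median": median(sizes)}
        for ts, sizes in by_second.items()
    }


# Two entries copied from the excerpts above, including the line wraps.
sample = """2018/05/04 18:29:50 Received blob
[sha224-0db450c2fbc821a7f3547ca6ad5c8d5971802f2573c5b8d61bade2c8; 68904
bytes]
2018/05/04 18:31:18 Received blob
[sha224-ebdf3a3e7d72f5eaab737a353ba59d97d82cb4fd108131b1b03befa1; 414 bytes]
"""

blobs = parse_blobs(sample)
for ts, stats in summarize(blobs).items():
    print(ts, stats)
```

Fed the full perkeepd output instead of the two-entry sample, the per-second 
median would show whether the small blobs arrive only at the very end or are 
interleaved throughout the upload.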

Case 2:
---------
Simply dividing 1 GB by the elapsed time for the HDD and SSD cases gives 
about 2MB/s and 10MB/s, which is significantly lower than I expected (in 
particular since CPU utilization is low, so it does not seem to be a 
SHA-hashing problem).
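
The arithmetic behind these figures can be sketched as follows. The file 
size comes from the dd invocation in Case 1 (bs=1024, count=1048576); 
treating 1 MB as 10^6 bytes is an assumption, and the function names are 
made up for illustration.

```python
def throughput_mb_per_s(num_bytes, seconds):
    """Average throughput in MB/s (1 MB = 10**6 bytes, an assumption)."""
    return num_bytes / seconds / 1_000_000


def expected_seconds(num_bytes, mb_per_s):
    """Wall-clock time needed to move num_bytes at the given rate."""
    return num_bytes / (mb_per_s * 1_000_000)


# Size of random.txt as created by: dd bs=1024 count=1048576
FILE_BYTES = 1024 * 1048576

print(round(expected_seconds(FILE_BYTES, 2)))   # 537 (seconds at 2 MB/s)
print(round(expected_seconds(FILE_BYTES, 10)))  # 107 (seconds at 10 MB/s)
```

Plugging the "real" value reported by time into throughput_mb_per_s gives 
the rate for a given run.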


Hopefully this is useful - I will continue to look into it regardless.

/Viktor

