Yes, the slow write performance is a known issue.

In general, we've ignored spinning disks as the target of initially
incoming blobs since SSDs and NVMe continue to get cheaper. The first
write should be to SSDs.

We do have https://github.com/perkeep/perkeep/issues/999 open to track
changing the protocol to allow clients to work with the server to do
more efficient batching.

For read performance, the "blobpacked" storage format rearranges all
those little blobs into one large contiguous zip file on disk, so read
performance later is very fast, streaming contiguously from disk. So
it's only writes that are slow.
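
To illustrate the idea only (this is a toy sketch, not the actual
"blobpacked" code): many small chunks are written in order into a single
uncompressed zip, and a later read streams them back sequentially. The
names chunk, packBlobs and readAll are made up for this example.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
	"io"
)

// chunk is a hypothetical stand-in for one small blob.
type chunk struct {
	Name string
	Data []byte
}

// packBlobs writes the chunks, in order, into one zip archive.
// Entries are stored uncompressed so their bytes sit back to back.
func packBlobs(chunks []chunk) ([]byte, error) {
	var buf bytes.Buffer
	zw := zip.NewWriter(&buf)
	for _, c := range chunks {
		w, err := zw.CreateHeader(&zip.FileHeader{Name: c.Name, Method: zip.Store})
		if err != nil {
			return nil, err
		}
		if _, err := w.Write(c.Data); err != nil {
			return nil, err
		}
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// readAll streams every entry back in archive order and concatenates them.
func readAll(packed []byte) ([]byte, error) {
	zr, err := zip.NewReader(bytes.NewReader(packed), int64(len(packed)))
	if err != nil {
		return nil, err
	}
	var out bytes.Buffer
	for _, f := range zr.File {
		rc, err := f.Open()
		if err != nil {
			return nil, err
		}
		_, err = io.Copy(&out, rc)
		rc.Close()
		if err != nil {
			return nil, err
		}
	}
	return out.Bytes(), nil
}

func main() {
	packed, err := packBlobs([]chunk{
		{"chunk-0", []byte("hello, ")},
		{"chunk-1", []byte("world")},
	})
	if err != nil {
		panic(err)
	}
	data, err := readAll(packed)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%d packed bytes, contents %q\n", len(packed), data)
}
```

The point is just that one large file replaces many tiny ones, so a read
becomes one sequential pass rather than a seek per blob.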

There is also the "cond" storage type to route schema blobs & data
blobs differently.
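
Roughly, the routing idea looks like the sketch below. This is a
simplified illustration, not the real "cond" implementation: Store, Cond
and isSchema are hypothetical names, and treating "a JSON object with a
camliVersion key" as the schema test is an assumption made for the
example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Store is a hypothetical, in-memory stand-in for a storage target.
type Store struct {
	Name  string
	Blobs map[string][]byte
}

func NewStore(name string) *Store {
	return &Store{Name: name, Blobs: make(map[string][]byte)}
}

func (s *Store) Put(ref string, data []byte) { s.Blobs[ref] = data }

// isSchema reports whether data parses as a JSON object that has a
// "camliVersion" key (an assumed, simplified schema check).
func isSchema(data []byte) bool {
	var m map[string]json.RawMessage
	if err := json.Unmarshal(data, &m); err != nil {
		return false
	}
	_, ok := m["camliVersion"]
	return ok
}

// Cond sends schema blobs to one store and everything else to another.
type Cond struct {
	Schema, Data *Store
}

// Put writes the blob to the chosen store and returns that store.
func (c *Cond) Put(ref string, data []byte) *Store {
	dst := c.Data
	if isSchema(data) {
		dst = c.Schema
	}
	dst.Put(ref, data)
	return dst
}

func main() {
	c := &Cond{Schema: NewStore("schema"), Data: NewStore("data")}
	fmt.Println(c.Put("ref-1", []byte(`{"camliVersion": 1, "camliType": "file"}`)).Name)
	fmt.Println(c.Put("ref-2", []byte("raw file bytes")).Name)
}
```

With something like this, the small schema blobs could live on one kind
of media and the bulk data blobs on another.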

Note that once we change the config file format
(https://github.com/perkeep/perkeep/issues/1134), it'll be much easier
to configure all the wrapper storage targets into arbitrary graphs.
Currently the low-level JSON config for that is tedious.

For now I recommend you store all blobs by default on SSD, but put
your "blobpacked" storage on spinning media, which will hold roughly 95%
of your stored bytes. Again, the current config file format doesn't make
that easy, but it'll be trivial (and with documented examples) to do
that soon.




On Fri, May 4, 2018 at 9:58 AM, Viktor <[email protected]> wrote:
> Hi,
>
> Just had a look at perkeepd as I found the philosophy interesting, and did a
> short test by, in short:
>
> Downloading the repo, checking out release/0.10, running perkeepd and trying
> to put in some files. Unfortunately the performance was way slower than my
> network, and thus I tried locally on the same machine, first with the blob
> storage on an HDD and then on an SSD (index on SSD in both cases).
>
> I figured that this might be of interest for you, I will also try to find
> out what is going on to both learn and see if I can use it :-)
>
> 1. When pk-put'ng a file it is initially broken into rather large
> portions/blobs, but towards the end it has degenerated to breaking into very
> small blobs - I can only assume creating performance problems both in terms
> of disk seeks/writes and ram usage.
> 2. The write performance is way lower than expected (CPU usage at ~20%
> throughout the pk-put process), generating about 2MB/s write performance
> with blob storage on an HDD and about 10MB/s on an SSD, both significantly
> below the expected performance of the disks.
>
> Is this expected or known?
>
> Recreation/case:
>
> Setup:
> git clone https://camlistore.googlesource.com/camlistore perkeep.org
> cd perkeep.org
> git checkout release/0.10
> go run make.go
> perkeepd
> (editing config to set index to be on ssd path and blobs to be on hdd and
> ssd path respectively)
>
> Case 1:
> ----------
> dd if=/dev/urandom of=random.txt bs=1024 count=1048576
> time pk-put file --permanode --title="test" --tag=backup random.txt
>
> Simultaneously looking at the output from perkeepd I see that the blob size
> seems to be decreasing over time, see e.g. the output of the early blobs and
> late blobs:
> Early: (assuming the first 2 are permanodes etc.)
> 2018/05/04 18:29:47 Received blob
> [sha224-4556e3a2241ea654a80d72d003382f976d864a54fc90b83fef40cf7c; 449 bytes]
> 2018/05/04 18:29:47 Received blob
> [sha224-3dcc6a1e24475efc417b4ab1b6cfe02ef2d9c085ead96b5a242ed58e; 628 bytes]
> 2018/05/04 18:29:50 Received blob
> [sha224-0db450c2fbc821a7f3547ca6ad5c8d5971802f2573c5b8d61bade2c8; 68904
> bytes]
> 2018/05/04 18:29:50 Received blob
> [sha224-425b332fa1ee724e191235a981341db7a25084790a868465e7ed0436; 77478
> bytes]
> 2018/05/04 18:29:50 Received blob
> [sha224-f5e6e2d09840ab93fc252ff172911f0013ae06c706f43e52b5e5a014; 66016
> bytes]
> 2018/05/04 18:29:50 Received blob
> [sha224-f3e373b4a528b8ec3734f0408f30d30aba8ca4d743f6b17ba52bda74; 262144
> bytes]
> 2018/05/04 18:29:50 Received blob
> [sha224-2573b030433ca209ddb5e2aea291946a6f67e5934b6a3a0e1471c8fc; 81458
> bytes]
> 2018/05/04 18:29:50 Received blob
> [sha224-451b14a507b3e15f44a589be39a34987cab8e403034822367e6a25ab; 68287
> bytes]
> 2018/05/04 18:29:50 Received blob
> [sha224-45a2a81086cc4497aa0fbded863da10a1a28a0ad6747f44c09c713d3; 69632
> bytes]
> Late:
> 2018/05/04 18:31:18 Received blob
> [sha224-ebdf3a3e7d72f5eaab737a353ba59d97d82cb4fd108131b1b03befa1; 414 bytes]
> 2018/05/04 18:31:18 Received blob
> [sha224-363b5bcb8b145fc2f29faf0d6d8de7ddb27bf68107ea544290c178ba; 412 bytes]
> 2018/05/04 18:31:18 Received blob
> [sha224-7b67715d347b97b6768ffbe7f4c9fcfa0fd083f9055e783bd2527750; 646 bytes]
> 2018/05/04 18:31:18 Received blob
> [sha224-c96a6e7fd1597cee8f347ef0deee45e645d8665ef96ae6372a582dda; 1947
> bytes]
> 2018/05/04 18:31:18 Received blob
> [sha224-19bce313cda795b30a7161487cdb7be0108dd3b466e92911a5465fb5; 295 bytes]
> 2018/05/04 18:31:18 Received blob
> [sha224-cde65fa0e356be938d3e395f2f01bab940631df3609007b96094a8d1; 297 bytes]
> 2018/05/04 18:31:18 Received blob
> [sha224-e80de3ab6d991b62878e078a7d3140a5d7f4717c2fdaabf4757c91b1; 531 bytes]
> 2018/05/04 18:31:18 Received blob
> [sha224-fe46fa614191ec90c11a5ccf40282c1d210292d7eca297a0d6ec2609; 412 bytes]
> 2018/05/04 18:31:18 Received blob
> [sha224-d63b69b880326dbd0aa0e73e2f3cf4305bb144db66bb950476603fb1; 1235
> bytes]
> 2018/05/04 18:31:18 Received blob
> [sha224-7b51468d9cbcec724b70c1accf822cac7bf3c6aacd25876caf80439d; 295 bytes]
>
> Maybe this is expected because some sorting is going on (will dig into the
> code later) - but I initially found it a bit strange and a potential
> explanation for 2).
>
> Case 2:
> ---------
> Simply dividing 1GB by the elapsed time for the HDD and SSD cases gives about
> 2MB/s and 10MB/s, which is significantly lower than I expected (in particular
> since CPU utilization is low, so it does not seem to be a SHA-hashing problem).
>
>
> Hopefully this is useful - I will continue to look into it regardless.
>
> /Viktor
>
>
> --
> You received this message because you are subscribed to the Google Groups
> "Perkeep" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/d/optout.
