On 2016-11-18 09:37, Hans van Kranenburg wrote:
Ha,

On 11/18/2016 01:36 PM, Austin S. Hemmelgarn wrote:
On 2016-11-17 16:08, Hans van Kranenburg wrote:
On 11/17/2016 08:27 PM, Austin S. Hemmelgarn wrote:
On 2016-11-17 13:51, Hans van Kranenburg wrote:
But, the fun with visualizations of data is that you learn whether they
just work(tm) or don't as soon as you see them. Mathematical or
algorithmic beauty is not always a good recipe for beauty as seen by the
human eye.

So, let's gather a bunch of ideas which we can try out and then observe
the result.

Before doing so, I'm going to restructure the code a bit more so I can
write another script in the same directory, just doing import heatmap
and calling a few functions in there to quickly try stuff, bypassing the
normal cli api.

Also, the PNG writing is currently done by some random png library that
I found, which requires me to build (or copy/resize) an entire pixel
grid in memory, explicitly listing all pixel values. That's a bit of a
memory hog for bigger pictures, so I want to see if something can be
done there as well.
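
One idea, if the library (pypng?) can take an iterator of rows instead
of a full two-dimensional list: only build one row of pixel values at a
time. A rough sketch, with made-up names, not tested:

    # Stream rows to the png writer instead of materializing the whole
    # pixel grid; only one row of values is in memory at a time.
    # Assumes pypng, whose Writer.write() accepts any iterable of rows.
    import png

    def write_png(path, width, height, row_of):
        # row_of(y) should return a list of 0-255 greyscale values,
        # one per pixel, for row number y
        writer = png.Writer(width=width, height=height,
                            greyscale=True, bitdepth=8)
        with open(path, 'wb') as out:
            writer.write(out, (row_of(y) for y in range(height)))
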
I haven't had a chance to look at the code yet, but do you have an
option to control how much data a pixel represents?  On a multi-TB
filesystem, for example, you may not care about exact data, just an
overall view, in which case making each pixel represent a larger chunk
of data (and thus reducing the resolution of the image) would almost
certainly save some memory.

--order, which defines the hilbert curve order.

Example: for a 238GiB filesystem with --order 7, 2**7 = 128, so the
grid is 128x128 = 16384 pixels, which means that a single pixel
represents ~15MiB.

When --size > --order, the image simply gets scaled up.

When --order is not specified, an order is chosen automatically so that
the resulting bytes per pixel ends up closest to 32MiB.

When --size is not specified, it defaults to 10, or to the same value
as --order if the order is greater than 10.
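
Concretely, the automatic order choice works out to something like this
(just a sketch of the idea, not the literal code; the search range for
the order is arbitrary here):

    # Pick the hilbert curve order whose bytes-per-pixel value ends up
    # closest to a 32MiB target; the grid is 2**order x 2**order pixels.
    def auto_order(total_bytes, target=32 * 1024 * 1024):
        def bytes_per_pixel(order):
            return total_bytes / float((2 ** order) ** 2)
        return min(range(1, 17),
                   key=lambda order: abs(bytes_per_pixel(order) - target))

    print(auto_order(255057723392))    # the 238GiB example below -> 7
    print(auto_order(46165378727936))  # the 40TiB example below -> 10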

Now this output should make sense:

-# ./heatmap.py /mnt/238GiB
max_id 1 num_devices 1 fsid ed108358-c746-4e76-a071-3820d423a99d
nodesize 16384 sectorsize 4096 clone_alignment 4096
scope filesystem curve hilbert order 7 size 10 pngfile
fsid_ed10a358-c846-4e76-a071-3821d423a99d_at_1479473532.png
grid height 128 width 128 total_bytes 255057723392 bytes_per_pixel
15567488.0 pixels 16384

-# ./heatmap.py /mnt/40TiB
max_id 2 num_devices 2 fsid 9bc9947e-070f-4bbc-872e-49b2a39b3f7b
nodesize 16384 sectorsize 4096 clone_alignment 4096
scope filesystem curve hilbert order 10 size 10 pngfile
/home/beheer/heatmap/generated/fsid_9bd9947e-070f-4cbc-8e2e-49b3a39b8f7b_at_1479473950.png
grid height 1024 width 1024 total_bytes 46165378727936 bytes_per_pixel
44026736.0 pixels 1048576

OK, here's another thought: is it possible to generate smaller chunks
of the image at a time and then use some external tool (ImageMagick
maybe?) to stitch those together into the final image? That might be
useful for other reasons too (if you implement it so you can do
arbitrary ranges, you could use it to split separate devices into
independent images).
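
Roughly what I have in mind (just a sketch; the tile file names, the
2x2 layout, and driving ImageMagick's montage from Python are all my
own invention, not anything the script does today):

    # Hypothetical stitching step: four images generated for four byte
    # ranges get combined into one picture with ImageMagick's montage;
    # -geometry +0+0 places the tiles edge to edge with no spacing.
    import subprocess

    tiles = ['tile_00.png', 'tile_01.png',
             'tile_10.png', 'tile_11.png']
    subprocess.check_call(['montage'] + tiles +
                          ['-tile', '2x2', '-geometry', '+0+0',
                           'full.png'])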