As a follow-up, I found this very interesting article: http://carloprad.blogspot.it/2014/03/orientdb-on-zfs-performance-analysis.html
The concept (as it relates to disk space usage, which wasn't the primary focus of that analysis) is essentially to move compression to the file system (ZFS in this case). The article also seems to conclude that ODB's built-in COMPRESSION setting is not very useful. File system compression, however, may be the best overall solution. In fact, I could envision "aggressive" padding settings (maybe 2x or more) to leave bigger "virtual holes" at the ODB storage engine level (to prevent record splitting), while leaving the efficient use of physical disk storage to the file system.

--Eric

On Saturday, May 28, 2016 at 1:13:55 AM UTC-5, scott molinari wrote:
> I can't help much, but I do remember reading that the records are padded with space. You can find that info here (towards the bottom):
>
> http://orientdb.com/docs/2.1/plocal-storage-engine.html
>
> I know this kind of "pre-allocation" technique is necessary to allow for a flexible schema, i.e. adding properties to records later on or updating records with more data than was there before. As I understand it, record "pre-allocation" is needed because, if the space taken by the record were exactly the size of the record, then adding data to it (making the record larger) would force the database to move the record on disk instead of updating it in place. You can imagine that if you then update a lot of records this way, you'd end up with a huge mess fast and the database would slow down considerably. To avoid that, the database pre-allocates space per record. ODB has the setting RECORD_GROW_FACTOR. In MongoDB, they recommend and set as default what they call "powersOfTwo". In other words, the database doubles the initial size of the document on disk. This is what is explained in the example in the docs.
>
> As I take it from the docs, the settings for record size can be changed through configuration.
> If you know your record size will never change, you could drop the values to "1". However, I could imagine that if you do that and then increase the data size even a little in a good number of records, that will not sit well with the database. Though, I am no expert on that.
>
> I'd also like to know the overhead values of the data types otherwise. It would be great basic knowledge of the database. If one of the nice gents from Orient would lay it out here, I'd even be glad to add it to the documentation. It would be a great addition to this table: http://orientdb.com/docs/latest/Types.html
>
> Scott
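
To make the padding idea a bit more concrete, here is a rough Python sketch of the trade-off discussed in this thread. It is only a toy model under stated assumptions, not how ODB actually lays out records: the function names (allocated_size, fits_in_place, padded_record) are made up for illustration, the padding is assumed to be zero bytes, and zlib merely stands in for whatever compression the file system would apply. The point is just that a larger grow factor lets more updates happen in place, while the unused padding costs very little once it is compressed.

```python
import math
import os
import zlib


def allocated_size(record_len: int, grow_factor: float) -> int:
    """Bytes reserved for a record of record_len bytes (hypothetical model)."""
    return max(record_len, math.ceil(record_len * grow_factor))


def fits_in_place(slot_size: int, new_len: int) -> bool:
    """An update can be written in place only if it fits the reserved slot."""
    return new_len <= slot_size


def padded_record(payload: bytes, grow_factor: float) -> bytes:
    """Payload followed by zero bytes up to the reserved slot size.

    Zero-filled padding is an assumption made for this sketch.
    """
    slot = allocated_size(len(payload), grow_factor)
    return payload + b"\x00" * (slot - len(payload))


if __name__ == "__main__":
    # Random bytes are an incompressible stand-in for real record data.
    payload = os.urandom(1000)
    for factor in (1.0, 1.5, 2.0):
        slot = padded_record(payload, factor)
        compressed = len(zlib.compress(slot))
        in_place = fits_in_place(len(slot), 1100)
        print(
            f"factor={factor}: slot={len(slot)} B, "
            f"compressed={compressed} B, "
            f"1100-byte update in place: {in_place}"
        )
```

With a factor of 1.0 the 1100-byte update no longer fits and the record would have to move. With larger factors it fits, and the compressed size should stay close to that of the unpadded record, which is the "virtual holes" effect described above.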
