I'd like to see some numbers computed from an actual (real) Firebird
database before it is considered.

But why only records over 4k?  And what commonality do you expect to find
on large records?
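
(For whoever wants to produce those numbers, here is a minimal harness --
assuming the stock lz4.h API; where the record images come from is up to
you:)

#include <lz4.h>
#include <vector>

// Feed each record's unencoded image through this, bucketed by size,
// then compare totalIn vs. totalOut per bucket.
void sample(const char* rec, int len, long long& totalIn, long long& totalOut)
{
    std::vector<char> dst(LZ4_compressBound(len));
    const int n = LZ4_compress_default(rec, dst.data(), len, (int) dst.size());
    totalIn  += len;
    totalOut += (n > 0) ? n : len;    // incompressible: count as stored raw
}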

On Monday, March 16, 2015, Slavomir Skopalik <skopa...@elektlabs.cz> wrote:

> Hi Jim,
> I did some research on storage compression and found this project:
>
> https://code.google.com/p/lz4/
>
> My idea is to use this only if the encoded size of a record will be more
> than approx. 4 KB.
>
> Do you have any thoughts on why it could be a bad idea?
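>
> A minimal sketch of the idea, assuming the stock lz4.h API (the threshold
> constant and the fallback convention are only illustrative):
>
> #include <lz4.h>
>
> static const unsigned BIG_RECORD = 4096;   // approx. 4 KB threshold
>
> // Returns the compressed length, or 0 if the record should keep the
> // existing RLE path (too small, or LZ4 did not actually shrink it).
> unsigned maybe_lz4(const char* rec, unsigned len, char* dst, unsigned dstSize)
> {
>     if (len <= BIG_RECORD)
>         return 0;
>     const int n = LZ4_compress_default(rec, dst, (int) len, (int) dstSize);
>     return (n > 0 && (unsigned) n < len) ? (unsigned) n : 0;
> }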
>
> Thanks Slavek
>
> PS: I made some changes in Firebird to rip the compressor out of the
> storage engine (and put a new RLE in as a second step, encoding as a third
> step), but it was rejected by the community :)
>
> Ing. Slavomir Skopalik
> Executive Head
> Elekt Labs s.r.o.
> Collection and evaluation of data from machines and laboratories
> by means of system MASA (http://www.elektlabs.cz/m2demo)
> -----------------------------------------------------------------
> Address:
> Elekt Labs s.r.o.
> Chaloupky 158
> 783 72 Velky Tynec
> Czech Republic
> ---------------------------------------------------------------
> Mobile: +420 724 207 851
> icq:199 118 333
> skype:skopaliks
> e-mail:skopa...@elektlabs.cz
> http://www.elektlabs.cz
>
> On 1.3.2015 18:55, Slavomir Skopalik wrote:
> > Hi Jim,
> > my proposal was not as abstract as yours.
> >
> > I just want to put all the parts of encoding/decoding into one class
> > with a clear interface that will make it possible to plug in a different
> > encoder at development time (FB3+).
> >
> > I will contact the Firebird developers to reach an agreement about
> > changes to this class:
> >
> > class Compressor : public Firebird::AutoStorage
> >
> > If it will be possible to have access to the record format, it would be
> > easy to create a self-describing encoding.
> > I have an idea for such a schema in mind that I would like to test.
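> >
> > Roughly this shape (only a sketch of what I mean, not the real FB3
> > class; UCHAR/ULONG are the usual Firebird typedefs, as in the SQZ
> > prototypes quoted below):
> >
> > class Compressor
> > {
> > public:
> >     virtual ~Compressor() {}
> >
> >     // Encode a whole record image; returns the encoded length.
> >     virtual ULONG pack(const UCHAR* rec, ULONG recLen,
> >                        UCHAR* out, ULONG outSize) = 0;
> >
> >     // Decode a reassembled fragment stream back into a record image.
> >     virtual ULONG unpack(const UCHAR* in, ULONG inLen,
> >                          UCHAR* rec, ULONG recSize) = 0;
> > };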
> >
> > Slavek
> >
> > On 28.2.2015 22:43, Jim Starkey wrote:
> >> OK, I think I understand what you are trying to do -- and please correct
> >> me if I'm wrong.  You want to standardize an interface between an
> >> encoding and DPM, separating the actual encoding/decoding from the
> >> fragmentation process.  In other words, you want to compress a record in
> >> toto, then let somebody else chop the resulting byte stream to and from
> >> data pages.  In essence, this makes the compression scheme
> >> plug-replaceable.
> >>
> >> If this is your intention, it isn't a bad idea, but it does have
> >> problems.  The first is how to map a given record to a particular
> >> decoding schema.  The second, more difficult, is how to do this without
> >> bumping the ODS (desirable, but not essential).  A third is how to
> >> handle encodings that are not variations on run length encoding (such
> >> as value based encoding).
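> >>
> >> (The usual answer to the first problem is a tag in front of the encoded
> >> stream; purely illustrative, not a real ODS layout, with decode_rle and
> >> decode_lz4 as hypothetical placeholders for the actual decoders, and
> >> Firebird's UCHAR/ULONG typedefs assumed:)
> >>
> >> enum RecEncoding { ENC_RLE = 0, ENC_LZ4 = 1 };
> >>
> >> bool decode_rle(const UCHAR* in, ULONG len, UCHAR* out, ULONG outSize);
> >> bool decode_lz4(const UCHAR* in, ULONG len, UCHAR* out, ULONG outSize);
> >>
> >> // First byte of the reassembled fragment stream selects the decoder.
> >> bool decode(const UCHAR* s, ULONG len, UCHAR* out, ULONG outSize)
> >> {
> >>     switch (s[0]) {
> >>     case ENC_RLE: return decode_rle(s + 1, len - 1, out, outSize);
> >>     case ENC_LZ4: return decode_lz4(s + 1, len - 1, out, outSize);
> >>     default:      return false;      // unknown encoding tag
> >>     }
> >> }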
> >>
> >> If I'm on the right track, do note that the current decoding schema
> >> already fits your bill.  Concatenate the fragments and decode.  The
> >> encoding process, on the other hand, is more problematic.
> >>
> >> Encoding/decoding in place is more efficient than using a temp, but not
> >> so much as to preclude it.  I might be wrong, but I doubt that the
> >> existing schema shows up as a hot spot in a profile.  But that said, I'm
> >> far from convinced that variations on a run length theme are going to
> >> have any significant benefit for either density or performance.
> >>
> >> My post-Interbase database systems don't access records on page (NuoDB
> >> doesn't even have pages).  Records have one format in storage and other
> >> formats in memory, within a record class that understands the
> >> transitions between formats (essentially doing the various encodings
> >> and decodings).  There are generally an encoded form (raw byte stream),
> >> a descriptor vector for building new records, and some sort of
> >> ancillary structure for field references to either.
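> >>
> >> (Roughly this shape -- a sketch of the idea only, not NuoDB's actual
> >> code; Descriptor here is a hypothetical per-field value holder:)
> >>
> >> #include <vector>
> >>
> >> struct Descriptor { int type; unsigned length; const unsigned char* data; };
> >>
> >> class Record
> >> {
> >>     std::vector<unsigned char> encoded;   // raw byte stream, as stored
> >>     std::vector<Descriptor>    values;    // descriptor vector for building
> >>     std::vector<unsigned>      offsets;   // ancillary: field -> position
> >> public:
> >>     void encode();   // values -> encoded, on the way to storage
> >>     void decode();   // encoded -> values, on first field reference
> >> };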
> >>
> >> In my mind, I think it would be wiser for Firebird to go with a flexible
> >> record object than to simply abstract the encoding/decoding process.
> >> More code would need to be changed, but when you were done, there would
> >> be much less code.
> >>
> >> Architecturally, abstracting encode/decode makes sense, but practically,
> >> I don't think it buys much.  A deep reorganization, I believe, would
> >> have a much better long-term payoff.
> >>
> >> But then maybe I missed your point...
> >>
> >> Jim Starkey
> >>
> >>
> >>> On Feb 28, 2015, at 10:30 AM, Slavomir Skopalik <skopa...@elektlabs.cz> wrote:
> >>>
> >>> Hi Jim,
> >>> I don't want to change the ODS just to save one byte per page.
> >>> I want to change the sources to be able to implement a different
> >>> encoder (pick whatever name you want) -> change the ODS.
> >>>
> >>> For some encoders the fragmentation loss is 1-2 bytes; for others it
> >>> can be more.
> >>> For some encoders reverse parsing is easy; for others it is much more
> >>> complicated.
> >>>
> >>> In some situations generating a control stream can be a benefit, but
> >>> as it stands now in the sources (FB2.5, FB3) that I read, it is not.
> >>>
> >>> Current compressor interface:
> >>> To create the control stream:
> >>> ULONG SQZ_length(const SCHAR* data, ULONG length, DataComprControl* dcc)
> >>>
> >>> To create the final stream from the control stream:
> >>> void SQZ_fast(const DataComprControl* dcc, const SCHAR* input,
> >>>     SCHAR* output)
> >>>
> >>> To calculate how many bytes can be compressed into a small area (from
> >>> the control stream):
> >>> USHORT SQZ_compress_length(const DataComprControl* dcc,
> >>>     const SCHAR* input, int space)
> >>>
> >>> To compress into a small area:
> >>> USHORT SQZ_compress(const DataComprControl* dcc, const SCHAR* input,
> >>>     SCHAR* output, int space)
> >>>
> >>> And to decompress:
> >>> UCHAR* SQZ_decompress(const UCHAR* input, USHORT length, UCHAR* output,
> >>>     const UCHAR* const output_end)
> >>>
> >>> And some routines are directly in the storage code.
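> >>>
> >>> (The calling sequence for a fragmented store, as I read it -- a sketch,
> >>> not the literal DPM code; data/length are the record image, and
> >>> spaceOnPage/pagePtr stand in for the per-page state:)
> >>>
> >>> DataComprControl dcc;
> >>> SQZ_length(data, length, &dcc);        // pass 1: build control stream
> >>> while (length)
> >>> {
> >>>     // how many input bytes fit on this page...
> >>>     const USHORT fit = SQZ_compress_length(&dcc, data, spaceOnPage);
> >>>     // ...then emit that fragment from the same control stream
> >>>     SQZ_compress(&dcc, data, pagePtr, spaceOnPage);
> >>>     data += fit;
> >>>     length -= fit;
> >>> }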
> >>>
> >>> FB3 is very similar (changed names, organized into a class, the same
> >>> hack in store_big_record; the problem is not the code itself, but where
> >>> the code is).
> >>>
> >>> The question is:
> >>> Why keep the control stream (worse CPU, slightly worse HDD, and, also
> >>> important for me, less readable code)?
> >>> It seems it was implemented this way because of a RAM limitation.
> >>>
> >>> And another question:
> >>> What functions and parameters should be in the new interface?
> >>>
> >>> If you have an idea of how to use the control stream with benefits,
> >>> please share it.
> >>>
> >>> Slavek
> >>>
> >>> BTW: If we drop the control stream, the posted code reduces to one
> >>> memcpy, which is implemented with SSE+ instructions.
> >>>
> >>>


-- 
Jim Starkey
