Obviously that would be the case for a flat file (no transactions) and for
the order of the transactions themselves.

So if that is also the case *inside* a transaction, then you are
effectively doing sub-operations, with a new transactional state for every
line in the transaction.

How about restricting transactions to always have the order DDDAAAA ..?
That would also help with reversibility, as you then can't remove triples
added in the same transaction. (Reversing is then just a matter of swapping
the A/D blocks.)

Perhaps the DDDAAAA ordering could be a restriction only for Reversible
transactions, as it could otherwise prevent a more "naive" log approach
from being used with transactions..?
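
To make the DDDAAAA idea concrete, here is a rough Python sketch. The row
model (a list of (op, triple) tuples) and the names is_deletes_first,
reverse_transaction and play_back are my own illustration, not anything
from the RDF Patch draft. Note that the swap only restores the original
graph if every D really removed a triple and every A really added a new one.

```python
# Hypothetical model: a transaction is a list of (op, triple) rows, where op
# is "A" (add) or "D" (delete). This is an illustration, not the RDF Patch API.

def is_deletes_first(rows):
    """Return True if no "D" row follows an "A" row (the DDDAAAA shape)."""
    seen_add = False
    for op, _triple in rows:
        if op == "A":
            seen_add = True
        elif op == "D" and seen_add:
            return False
    return True


def reverse_transaction(rows):
    """Build the undo transaction by swapping the A and D blocks."""
    if not is_deletes_first(rows):
        raise ValueError("transaction is not in DDDAAAA order")
    undo_deletes = [("D", t) for op, t in rows if op == "A"]
    undo_adds = [("A", t) for op, t in rows if op == "D"]
    return undo_deletes + undo_adds  # the result is deletes-first as well


def play_back(graph, rows):
    """Apply rows linearly to a set of triples and return the new set."""
    result = set(graph)
    for op, triple in rows:
        if op == "A":
            result.add(triple)
        else:
            result.discard(triple)
    return result


if __name__ == "__main__":
    graph = {("s", "p", "o1"), ("s", "p", "o2")}
    tx = [("D", ("s", "p", "o1")), ("A", ("s", "p", "o3"))]
    after = play_back(graph, tx)
    restored = play_back(after, reverse_transaction(tx))
    print(restored == graph)
```

With a "naive" log that allows interleaved A and D rows, reverse_transaction
would reject the transaction, which is the trade-off I mean above.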

On 19 Oct 2016 1:40 pm, "Rob Vesse" <rve...@dotnetrdf.org> wrote:

> I am pretty sure that the intent is that a patch must be read in linear
> order, i.e. it is not designed for parallel processing.
>
> On 19/10/2016 11:34, "Stian Soiland-Reyes" <st...@apache.org> wrote:
>
>     I had a quick go, and the gzip penalty for using expanded forms
>     without "R" was negligible (~0.1%, a bit higher with no prefixes).
>     Using "R" also means you can't process the RDF Patch in parallel
>     without preprocessing.  (Same for prefixes.)
>
>     Using "R" could also restrict possible compression patterns, for
>     instance in:
>
>     A <http://example.com/thingie15>
>     <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
>     <http://schema.org/Person> .
>     A <http://example.com/thingie15>
>     <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
>     <http://schema.org/Person> .
>
>     a good compression algorithm might recognize patterns in here like:
>
>      .\nA <http://example.com/thingie
>     > <http://www.w3.org/
>     #type> <http://schema.org/
>
>
>     Using "R" would restrict possible patterns - betting on it recognizing
>     "> .\nA R R" (which sometimes would work well).
>
>
>
>     Can RDF Patch items within a transaction be considered in any order
>     (first all the DELETEs, then all the ADDs), or do they have to be
>     played back linearly?
>
>
>     On 19 October 2016 at 10:57, Rob Vesse <rve...@dotnetrdf.org> wrote:
>     > Yes, but ANY is a form of lossy compression. You lose the actual
>     > details of what was removed. Also, it can only be used for removals
>     > and yields no benefit for additions.
>     >
>     >  On the other hand REPEAT is lossless compression.
>     >
>     >  However, if you apply general-purpose compression like gzip on top
>     >  of the patch you probably get just as good compression without
>     >  needing any special tokens. In my experience REPEAT is more useful
>     >  in compact binary formats, where you can use fewer bytes to encode
>     >  it than either the term itself or a reference to the term in some
>     >  lookup table.
>     >
>     > On 14/10/2016 17:09, "Andy Seaborne" <a...@apache.org> wrote:
>     >
>     >     These two together seem a bit contradictory.  The advantage of
>     >     ANY, with versions, is that it is a form of compression.
>     >
>     >
>     >
>     >
>
>
>
>     --
>     Stian Soiland-Reyes
>     http://orcid.org/0000-0001-9842-9718
>
