Justin - is this a good use for bitemporal?

Tim

On Jun 29, 2016, at 5:58 PM, Justin Makeig <[email protected]> wrote:

>  the point-in-time feature does not require that we disable merges.  It just 
> requires that the merge timestamp is set to the earliest point back in time 
> where we want to be able to look back to.

Yes. That's correct. The further you push the merge timestamp back, the more 
you're going to stress normal operations, though. Merges allow the database to 
optimize the storage and indexes to support high-performance I/O. They're not a 
nice-to-have; they're a required aspect of how MarkLogic works. Delaying them 
or turning them off is an advanced operation for special cases, such as 
rolling back a database for disaster recovery. 

> What I am still missing is why the "Inside MarkLogic" document describes how 
> MVCC timestamps can be used to implement "Time Travel" and the "Application 
> Developer's Guide" describes point-in-time queries if you (assuming that you 
> speak for MarkLogic) advise against using them.

Point-in-time queries are good for "micro" time travel, if I may coin a term. 
They're good for maintaining a consistent snapshot over a very short period of 
time, within the finite window that you'd configure to not merge. Beyond that 
window, the history is gone, optimized away. If you need to keep that history, 
you should do so explicitly in documents. (That's how the Document Library 
Services and Bitemporal APIs maintain version histories.)
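
To make "explicitly in documents" a bit more concrete, here is a minimal 
sketch in plain Python. It is not the MarkLogic temporal API; the `Version` 
record, the `as_of` helper, and the prices are all made up for illustration. 
It only shows the idea of keeping every version with a valid-time range and a 
system-time range, using the trade dates from the example quoted further down 
in this thread.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional

FOREVER = date.max  # open-ended range marker (an assumption of this sketch)


@dataclass(frozen=True)
class Version:
    """One immutable version of a hypothetical trade document."""
    price: float          # made-up payload
    valid_start: date     # when the fact is effective in the real world
    valid_end: date
    system_start: date    # when the database learned this version
    system_end: date      # when this version was superseded


def as_of(versions: List[Version], valid_on: date, known_on: date) -> Optional[Version]:
    """Return the version effective on `valid_on` as it was known on `known_on`."""
    for v in versions:
        if (v.valid_start <= valid_on < v.valid_end
                and v.system_start <= known_on < v.system_end):
            return v
    return None


# Trade effective 2016-06-01, recorded 2016-06-02, corrected 2016-06-05.
# Old versions are never removed; a correction only closes their system range.
history = [
    Version(100.0, date(2016, 6, 1), FOREVER, date(2016, 6, 2), date(2016, 6, 5)),
    Version(101.5, date(2016, 6, 1), FOREVER, date(2016, 6, 5), FOREVER),
]

print(as_of(history, date(2016, 6, 1), date(2016, 6, 3)).price)  # 100.0
print(as_of(history, date(2016, 6, 1), date(2016, 6, 6)).price)  # 101.5
print(as_of(history, date(2016, 6, 1), date(2016, 6, 1)))        # None
```

Because the history lives in ordinary records, it does not depend on whether 
obsolete fragments have been merged away.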

> Is the documentation accurate? 

Yes, but a little light on why you'd use point-in-time queries and what the 
boundaries and implications are.

> Under what circumstances do you recommend using the point-in-time technique 
> described in the guide?  Does the point-in-time query technique only work if 
> merges are disabled?

Yes, doing anything at a point in time in the past means that you need to 
maintain the (MVCC) state back to that point. The only way the database can do 
that is by not merging out all of the obsolete fragments. This is OK for finite 
windows of time, but the database needs to eventually merge. (You can give the 
merge timestamp a negative value to maintain a rolling window.)
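
As a rough illustration of why the look-back window is bounded by the merge 
timestamp, here is a toy model in Python. It is a sketch under simplifying 
assumptions, not how MarkLogic is implemented and not its API: `ToyStore`, 
`Fragment`, and the integer clock are hypothetical, and a negative merge 
timestamp is simply treated as an offset from the current clock.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Fragment:
    uri: str
    content: str
    created: int                   # timestamp at which this version became visible
    deleted: Optional[int] = None  # timestamp at which it was superseded


class ToyStore:
    """Toy MVCC store (hypothetical; for illustration only)."""

    def __init__(self) -> None:
        self.clock = 0
        self.fragments: List[Fragment] = []

    def write(self, uri: str, content: str) -> int:
        """Add a new version; the old one is marked obsolete, not removed."""
        self.clock += 1
        for f in self.fragments:
            if f.uri == uri and f.deleted is None:
                f.deleted = self.clock
        self.fragments.append(Fragment(uri, content, self.clock))
        return self.clock

    def read(self, uri: str, at: int) -> Optional[str]:
        """Point-in-time read: the version visible at timestamp `at`."""
        for f in self.fragments:
            if f.uri == uri and f.created <= at and (f.deleted is None or at < f.deleted):
                return f.content
        return None

    def merge(self, merge_timestamp: int) -> None:
        """Drop obsolete fragments that no query at or after the horizon can see."""
        if merge_timestamp < 0:
            horizon = self.clock + merge_timestamp  # rolling window
        else:
            horizon = merge_timestamp
        self.fragments = [
            f for f in self.fragments if f.deleted is None or f.deleted > horizon
        ]


store = ToyStore()
store.write("a", "v1")   # timestamp 1
store.write("a", "v2")   # timestamp 2
store.write("a", "v3")   # timestamp 3

print(store.read("a", 1))  # v1 (nothing merged yet)
store.merge(-1)            # keep a window of one tick behind the clock
print(store.read("a", 1))  # None (history before the window is gone)
print(store.read("a", 2))  # v2
print(store.read("a", 3))  # v3
```

The wider the window, the more obsolete fragments stay around, which is the 
cost described above.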


Justin


--
Justin Makeig
Director, Product Management
MarkLogic
[email protected]

> On Jun 29, 2016, at 12:18 PM, Hans Hübner <[email protected]> wrote:
> 
> Justin,
> 
> thank you for the additional documentation pointer.  From what I read, I 
> understand that merging is a useful operation and that merges should not be 
> disabled.  I can agree to that, but as far as I have understood, the 
> point-in-time feature does not require that we disable merges.  It just 
> requires that the merge timestamp is set to the earliest point back in time 
> where we want to be able to look back to.  Does setting the merge timestamp 
> automatically disable the merges?
> 
> What I am still missing is why the "Inside MarkLogic" document describes how 
> MVCC timestamps can be used to implement "Time Travel" and the "Application 
> Developer's Guide" describes point-in-time queries if you (assuming that you 
> speak for MarkLogic) advise against using them.  The "Application Developer's 
> Guide" in particular describes how such queries work, in detail, and it does 
> not mention that one should avoid the technique.
> 
> Is the documentation accurate?  Under what circumstances do you recommend 
> using the point-in-time technique described in the guide?  Does the 
> point-in-time query technique only work if merges are disabled?
> 
> Hans
> 
> On Wed, Jun 29, 2016 at 7:40 PM, Justin Makeig <[email protected]> wrote:
>> Can you elaborate what you mean by "maintain the health of a database"?  If 
>> we'd decide that we never want to delete any data in a certain MarkLogic 
>> database so that we can roll back to any point in time, what would be the 
>> down sides?  How would  the database become unhealthy?
> 
> Please take a look at the docs on merging, specifically the section "Merges 
> Are Good" <https://docs.marklogic.com/guide/admin/merges#id_43904>. Merging 
> is the way that MarkLogic manages its internal data to support efficient and 
> consistent ingest and query I/O. It is an internal process and completely 
> orthogonal to how you version your documents. 
> 
> What you describe sounds more like temporal versioning. Please take a look at 
> MarkLogic's bitemporal APIs 
> <https://docs.marklogic.com/guide/temporal/intro>. With bitemporal 
> management you maintain an immutable copy of the entire history of your data 
> that you can query at any point in time. The APIs do all of the sophisticated 
> work maintaining versions securely. The "bi" in bitemporal allows you to 
> query the valid time of the document (e.g. a trade was effective on 
> 2016-06-01) as you knew it at any point in time (e.g. the trade wasn't 
> recorded until 2016-06-02 and then it was corrected on 2016-06-05). 
> 
> Justin
> 
> 
>> On Jun 28, 2016, at 9:55 PM, Hans Hübner <[email protected]> wrote:
>> 
>> On Tue, Jun 28, 2016 at 10:36 PM, Justin Makeig <[email protected]> wrote:
>> > as we want to be able to use the point-in-time query feature to track 
>> > document changes over time
>> 
>> Point-in-time queries 
>> <https://docs.marklogic.com/guide/app-dev/point_in_time> are not designed 
>> for versioning, as I think you're describing it. The timestamps are internal 
>> bookkeeping. (Think of them as monotonically increasing integers rather than 
>> wall clock readings.) Querying at a specific timestamp relies on _not_ 
>> merging deleted fragments. For short windows, like minutes or even hours, 
>> depending on your workload, this is OK. However, merging is necessary and 
>> useful to maintain the health of a database.
>>  
>> Can you elaborate what you mean by "maintain the health of a database"?  If 
>> we'd decide that we never want to delete any data in a certain MarkLogic 
>> database so that we can roll back to any point in time, what would be the 
>> down sides?  How would  the database become unhealthy?
>> 
>> We have an existing application that makes use of another database system 
>> (Datomic) exactly in that way, and we would like to carry it over to 
>> MarkLogic.  The "Inside MarkLogic" document describes point-in-time queries 
>> as "Time Travel", but what you write seems to say that using timestamps that 
>> way would be detrimental to the health of the database, so I'd like to learn 
>> more before we convert.
>> 
>> Thanks!
>> Hans
>> 
>> -- 
>> LambdaWerk GmbH
>> Oranienburger Straße 87/89
>> 10178 Berlin
>> Phone: +49 30 555 7335 0
>> Fax: +49 30 555 7335 99
>> 
>> HRB 169991 B Amtsgericht Charlottenburg
>> USt-ID: DE301399951
>> Geschäftsführer:  Hans Hübner
>> 
>> http://lambdawerk.com/
>> 
>> 
>> _______________________________________________
>> General mailing list
>> [email protected]
>> Manage your subscription at: 
>> http://developer.marklogic.com/mailman/listinfo/general
