For a single batch load I like that, but with repeated loads you'd have to create a new role for every batch to distinguish the new content from the old. Using collections seems mentally cheaper/lighter to me. My 2c.
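[Editor's note: as a hedged sketch of the collection approach discussed in this thread, loading each batch into a uniquely named collection and then querying only the completed batches might look like the following. The URIs, batch names, and document content are invented for illustration; the built-ins used (xdmp:document-insert, xdmp:default-permissions, cts:search, cts:collection-query, cts:not-query) are standard MarkLogic functions.]

```xquery
xquery version "1.0-ml";

(: Load a document into a batch-specific collection.
   URI, content, and collection name are illustrative. :)
xdmp:document-insert(
  "/content/doc-0001.xml",
  <doc>example content</doc>,
  xdmp:default-permissions(),
  ("batch-2010-03-18")
);

(: Query only the batches known to be completely loaded... :)
let $complete := ("batch-2010-03-16", "batch-2010-03-17")
return cts:search(fn:doc(), cts:collection-query($complete));

(: ...or, as Jason suggests, exclude just the in-flight batch
   and allow everything loaded before it. :)
cts:search(fn:doc(),
  cts:not-query(cts:collection-query("batch-2010-03-18")))
```

Because collection membership is assigned at load time, dropping the in-flight collection name from the query (or removing the not-query) makes the whole batch visible at once.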
-jh-

On Mar 18, 2010, at 9:47 AM, Danny Sokolsky wrote:

> The URI privilege does not control access to the document; it specifies
> whether you can create a document in that URI space.
>
> You can do what Keith suggests by putting a read permission on each
> document that is associated with a role. Then, when you are ready, grant
> that role to a role your users already have. To do this, you would have
> to add several permissions during the load. For example, you might add
> read and update permissions for a "loader" role, and also add a read
> permission for a "content-user" role. Then, after you are satisfied that
> your content is the way you want it, you can give the "content-user"
> role to the user of your application.
>
> -Danny
>
> From: [email protected]
> [mailto:[email protected]] On Behalf Of Keith L. Breinholt
> Sent: Thursday, March 18, 2010 9:34 AM
> To: General Mark Logic Developer Discussion
> Subject: RE: [MarkLogic Dev General] "Hot Swapping" large data sets.
>
> Another way to let you load and update sets, and only make them visible
> when you are done, is to load the content with a unique URI privilege
> that is assigned to your loader/enricher program.
>
> Then, when you are done and the content is ready, you can add that
> privilege to the role of any users/applications that need to see it.
> That way only completed content is visible, and it appears
> 'instantaneously' when the privilege is added to the role.
>
> Keith L. Breinholt
> [email protected]
>
> From: [email protected]
> [mailto:[email protected]] On Behalf Of Jason Hunter
> Sent: Thursday, March 18, 2010 12:10 AM
> To: General Mark Logic Developer Discussion
> Subject: Re: [MarkLogic Dev General] "Hot Swapping" large data sets.
>
> On Mar 17, 2010, at 5:23 AM, Lee, David wrote:
>
> I need to update some largish (1 GB+) sets of documents fairly atomically.
> That is, I'd like to update all the documents, perform some operations
> like adding properties, etc., and then make all the updates visible at
> once. The update process could take several hours.
>
> Currently this document set shares the same forest as other document
> sets. It's not possible to split them up, because the app needs to
> cross-query across all the document sets.
>
> Any suggestions on how to accomplish this?
>
> What happens if you try loading everything as part of a single XCC call,
> passing the large array of files?
>
> If you want to follow Wayne's advice on using collections, I suppose
> you'd want to put each batch of docs in a uniquely named collection.
> Then you can run your queries against fn:collection($seq), where $seq is
> the sequence of collections that have been loaded so far. Or, perhaps
> more simply, you can do a cts:not-query() against
> cts:collection-query("latest") and thus exclude the most recent batch
> while allowing all other docs that were loaded before. It basically
> keeps the new collection in the dark. Handy, efficient, and if each
> batch gets its own ID you can easily exclude any batch.
>
> Point-in-time queries would do something similar, and are suitable if
> you're always doing just one bulk load at a time. Then you can use the
> point in time to control visibility.
>
> -jh-
>
> _______________________________________________
> General mailing list
> [email protected]
> http://xqzone.com/mailman/listinfo/general
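[Editor's note: a hedged sketch of the permission-based approach Danny describes, using his "loader" and "content-user" role names. This is illustrative, not from the thread: the URI and content are invented, and the roles are assumed to already exist in the security database. xdmp:document-insert, xdmp:permission, and xdmp:document-add-permissions are standard MarkLogic built-ins.]

```xquery
xquery version "1.0-ml";

(: Load each document with permissions for both the loader role
   (so the enricher can read and update it) and the eventual
   content-user role (read only). Users do not yet have the
   content-user role, so the document stays invisible to them. :)
xdmp:document-insert(
  "/content/doc-0001.xml",
  <doc>example content</doc>,
  (xdmp:permission("loader", "read"),
   xdmp:permission("loader", "update"),
   xdmp:permission("content-user", "read")));

(: If a document was already loaded without the content-user
   permission, it can be added afterwards. :)
xdmp:document-add-permissions(
  "/content/doc-0001.xml",
  xdmp:permission("content-user", "read"))
```

Once the batch is verified, granting the "content-user" role to the users' existing role (via the Admin UI or the security API) makes all of the loaded documents visible at once.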
