On Mar 17, 2010, at 5:23 AM, Lee, David wrote:
> I need to be updating some largish (1G+) sets of documents fairly atomically.
> That is, I'd like to update all the documents and perform some operations
> like adding properties etc,
> then all at once make the updates visible. The update process could take
> several hours.
> Currently this document set shares the same forest as other document sets.
> It's not possible to split these up because the app needs to query across
> all the document sets.
>
> Any suggestions on how to accomplish this?
What happens if you try loading everything as part of a single XCC call passing
the large array of files?
If you want to follow Wayne's advice on using collections, I suppose you'd want
to put each batch of docs in a uniquely named collection. Then you can run
your queries against fn:collection($seq) where $seq is the sequence of
collections that have been loaded so far. Or, perhaps more simply, you can do
a cts:not-query() against cts:collection-query("latest") and thus exclude
the most recent batch while still matching all the docs loaded before. That
effectively keeps the newest collection invisible until you're ready. Handy,
efficient, and if each batch gets its own ID then you can easily exclude any
batch.
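A rough sketch of that approach in server-side XQuery (the URIs and
collection names here are just illustrative, and the final step assumes the
URI lexicon is enabled so cts:uris() works):

```xquery
(: During the bulk load, tag each new document with a per-batch
   collection plus a shared "latest" marker collection. :)
xdmp:document-insert(
  "/docs/example.xml",
  <doc/>,
  xdmp:default-permissions(),
  ("latest", "batch-2010-03-17"))

(: Meanwhile, app queries exclude the in-progress batch: :)
cts:search(
  fn:doc(),
  cts:and-query((
    cts:word-query("foo"),
    cts:not-query(cts:collection-query("latest")))))

(: When the load completes, strip the "latest" tag so the whole
   batch becomes visible at once (requires the URI lexicon): :)
for $uri in cts:uris((), (), cts:collection-query("latest"))
return xdmp:document-remove-collections($uri, "latest")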
Point-in-time queries would do something similar, and are suitable if you're
always doing just one bulk load at a time. You can then use the query
timestamp to control visibility.
-jh-
_______________________________________________
General mailing list
[email protected]
http://xqzone.com/mailman/listinfo/general