If there are no search terms then documents will appear in native database order, also called "document order". This is something like RDBMS "row order". Generally speaking that won't be the same as the insertion order.

So how can we get the most recent documents in a reliable way? If you have maintain-last-modified enabled for the database, every document has a property fragment with a prop:last-modified element. The docs guide https://docs.marklogic.com/guide/app-dev/properties talks about this feature, and https://docs.marklogic.com/admin-help/database describes the database configuration. Note that enabling it won't affect documents inserted previously. You'll have to reinsert them or update them, and then they'll get a prop:last-modified timestamp as of that latest insert or update.

When sorting or querying on prop:last-modified, you'll want it to be fast. Per https://docs.marklogic.com/guide/performance/order_by that will be most efficient with an element range index. But watch out: the last-modified property isn't part of the main document fragment, and sorting can't use range index data from a different fragment. So we have to sort the property fragments by prop:last-modified first. Then we can do other things with the results.

Let's try that. First, create an element range index of type dateTime on {http://marklogic.com/xdmp/property}last-modified. See https://docs.marklogic.com/admin-help/range-element-index and https://docs.marklogic.com/guide/admin/range_index for more on that topic.

Next we'll need some test content. Here's a query to insert 10 documents with different timestamps.

    (: insert test documents :)
    for $i in 1 to 10
    let $_ := xdmp:invoke-function(
      function() {
        let $id := xdmp:integer-to-hex(xdmp:random())
        return xdmp:document-insert(
          '/test/'||$id,
          element test { attribute id { $id } }),
        xdmp:commit() },
      <options xmlns="xdmp:eval">
        <transaction-mode>update</transaction-mode>
      </options>)
    let $_ := xdmp:sleep(1000)
    return $i
    =>
    1 2 3 4 5 6 7 8 9 10

That query uses some ML7 features, but you should be able to port it to ML5 without too much trouble. The '||' operator concatenates strings, like fn:concat or the Java '+' operator applied to strings.
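
For ML5, a port of the insert query might look roughly like the following. Treat it as an untested sketch rather than a drop-in replacement: it assumes xdmp:eval with the different-transaction isolation option in place of the inline function, and fn:concat in place of '||'. The external variable name $id is only a label chosen for this example.

    xquery version "1.0-ml";
    (: insert test documents, ML5-style sketch :)
    for $i in 1 to 10
    (: generate the id outside the eval and pass it in as an external variable :)
    let $id := xdmp:integer-to-hex(xdmp:random())
    let $_ := xdmp:eval('
      xquery version "1.0-ml";
      declare variable $id as xs:string external;
      xdmp:document-insert(
        fn:concat("/test/", $id),
        element test { attribute id { $id } })',
      (xs:QName('id'), $id),
      <options xmlns="xdmp:eval">
        <isolation>different-transaction</isolation>
      </options>)
    (: pause so each insert lands on a different second :)
    let $_ := xdmp:sleep(1000)
    return $i

Each eval runs and commits as its own transaction, so there is no explicit commit here, and the sleep between iterations still spaces out the timestamps.
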
You can use an xdmp:eval instead of the invoke-function magic, and you shouldn't need the xdmp:commit. The important bit is the xdmp:sleep call between sub-transactions, which ensures that each document has a different prop:last-modified. Let's check that.

    xdmp:document-properties()/prop:properties/prop:last-modified/data(.)
    =>
    2014-08-08T08:43:39-07:00
    2014-08-08T08:43:43-07:00
    2014-08-08T08:43:44-07:00
    2014-08-08T08:43:38-07:00
    2014-08-08T08:43:47-07:00
    2014-08-08T08:43:45-07:00
    2014-08-08T08:43:46-07:00
    2014-08-08T08:43:42-07:00
    2014-08-08T08:43:40-07:00
    2014-08-08T08:43:41-07:00

These timestamps are in document order, which doesn't match the original insert order. In fact it looks random. But we can still get the most recent N documents using prop:last-modified.

    xdmp:query-trace(true()),
    let $count := 5
    let $start := 1
    let $stop := $start + $count - 1
    return (
      for $p in xdmp:document-properties()
      order by $p/prop:properties/prop:last-modified descending
      return text {
        xdmp:node-uri($p),
        $p/prop:properties/prop:last-modified })[$start to $stop]
    =>
    /test/8819ad493f97c9dd 2014-08-08T08:43:47-07:00
    /test/bb8f23b3fc0446f7 2014-08-08T08:43:46-07:00
    /test/b33af6f2becf4262 2014-08-08T08:43:45-07:00
    /test/95fa44068813646 2014-08-08T08:43:44-07:00
    /test/b1e91203787593ad 2014-08-08T08:43:43-07:00

Remember that we generated the ids and URIs with xdmp:random, so yours will be different. Production code probably wouldn't include that xdmp:query-trace, but it lets us see if the database really used the prop:last-modified range index. The query trace output appears in ErrorLog.txt:

    Analyzing path for $p: xdmp:document-properties()
    Step 1 is searchable: xdmp:document-properties()
    Path is fully searchable.
    Gathering constraints.
    Step 1 contributed 1 constraint: xdmp:document-properties()
    Order by clause contributed 1 range ordering constraint for $p: order by $p/prop:properties/prop:last-modified descending
    Executing search.
    Selected 10 fragments to filter.

That "Order by clause..." line tells us that sorting used the range index. So I'd expect this query to be efficient and to scale well as the database grows.

Now we know how to fetch the N most recent URIs quickly. We could also query for URIs modified after a certain dateTime, using that same range index on prop:last-modified.

    for $p in cts:search(
      xdmp:document-properties(),
      cts:element-range-query(
        xs:QName('prop:last-modified'), '>',
        xs:dateTime('2014-08-08T08:43:42-07:00')))
    order by $p/prop:properties/prop:last-modified descending
    return text {
      xdmp:node-uri($p),
      $p/prop:properties/prop:last-modified }
    =>
    /test/8819ad493f97c9dd 2014-08-08T08:43:47-07:00
    /test/bb8f23b3fc0446f7 2014-08-08T08:43:46-07:00
    /test/b33af6f2becf4262 2014-08-08T08:43:45-07:00
    /test/95fa44068813646 2014-08-08T08:43:44-07:00
    /test/b1e91203787593ad 2014-08-08T08:43:43-07:00

This time the 'order by' wasn't strictly necessary, but it makes the results easier to read.

Now that we know how to get the URIs from recently modified documents, we might want the original documents. That's pretty easy, and the technique is the same with either query. Keep in mind that the extra fn:doc call adds an O(n) factor to the query. So fetch the main document if you need to, but don't do it if the URI alone is enough.

    for $p in cts:search(
      xdmp:document-properties(),
      cts:element-range-query(
        xs:QName('prop:last-modified'), '>',
        xs:dateTime('2014-08-08T08:43:42-07:00')))
    order by $p/prop:properties/prop:last-modified descending
    return doc(xdmp:node-uri($p))
    =>
    <test id="8819ad493f97c9dd"/>
    <test id="bb8f23b3fc0446f7"/>
    <test id="b33af6f2becf4262"/>
    <test id="95fa44068813646"/>
    <test id="b1e91203787593ad"/>

You could use the same technique in the "N most recent" version of the query, too.

-- Mike

On 8 Aug 2014, at 06:04, Chad Bishop <[email protected]> wrote:

> Greetings,
>
> Is there any built-in functionality to retrieve the most recently added
> documents to a collection or directory?
>
> It looks like a blank search does the trick, but would prefer something more
> efficient.
>
> We’re still on ML 5.
>
> Thanks much,
>
> -Chad
> _______________________________________________
> General mailing list
> [email protected]
> http://developer.marklogic.com/mailman/listinfo/general
