That's an interesting problem, especially if you ever want to use WebDAV for
directory listings.  Most of the content I deal with is named by a fixed-length
numerical ID, so I break the information up hierarchically to reduce the
number of entries under any one directory URI.  I also found it helpful
to leverage the URI lexicon to spawn tasks that delete content - as long as I
do not overflow the task server queue.
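For what it's worth, a rough sketch of that pattern - a self-respawning task
that deletes one batch per transaction (the module name, batch size, and
external variable here are hypothetical; assumes the URI lexicon is enabled):

   (: delete-batch.xqy - hypothetical task module :)
   declare variable $DIR as xs:string external;
   let $uris := cts:uris((), ('limit=1000'),
                  cts:directory-query($DIR, 'infinity'))
   return (
     for $uri in $uris
     return xdmp:document-delete($uri),
     (: re-spawn until the directory is empty :)
     if (exists($uris))
     then xdmp:spawn('/delete-batch.xqy', (xs:QName('DIR'), $DIR))
     else ()
   )

Since each spawned task is its own transaction, no single transaction ever
holds more than one batch of deletes, and only one task sits in the queue
at a time.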

 

Tim Meagher

 

-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Lee, David
Sent: Wednesday, December 09, 2009 12:47 PM
To: Michael Blakeley; General Mark Logic Developer Discussion
Subject: RE: [MarkLogic Dev General] Cannot delete directory with 1mil docs-
XDMP-MEMORY

 

Thanks for the suggestion

I am running 4.1-3, I have plenty of swap space.

 

I tried the bulk deletes but they were taking about 1 minute per 1000
documents to delete ...

I gave up after a few hours.

 

I've created a new DB and have started the process of reloading; I'm about
2/3 through, then I'll delete the old forest.

 

I've come to the conclusion that, at least on my system, which is admittedly
not that powerful (32-bit Linux, 4 GB RAM, 2.8 GHz), ML doesn't handle
directories with > 1mil entries very well.

I try to add more than that and run into all sorts of memory problems.

I try to *delete* that directory and can't.

 

It also doesn't handle individual files with > 1mil fragments that well, but
at least it handles them.

For my experimental case, I'm now trying a hybrid approach: bulking up 1000
"rows" per file and keeping the number of files in a directory in the
thousands, not millions ...

 

 

 

-----Original Message-----
From: Michael Blakeley [mailto:[email protected]]
Sent: Wednesday, December 09, 2009 12:33 PM
To: General Mark Logic Developer Discussion
Cc: Lee, David
Subject: Re: [MarkLogic Dev General] Cannot delete directory with 1mil docs
- XDMP-MEMORY

 

The XDMP-MEMORY message does mean that the host couldn't allocate the
needed memory. In this case that was probably because the transaction
was too large to fit in memory. If you aren't already using 4.1-3, I'd
upgrade - just in case this is a known problem that has already been fixed.
 

If 4.1-3 doesn't help, then I suppose you could increase the swap
space... but I don't think you'd like the performance. You might be able
to reduce the sizes of the group-level caches, but that might lead to
*CACHEFULL errors.

 

So as Geert suggested, clearing the forest is probably the fastest
solution. Or if you don't mind spending more time on it, you could
delete in blocks of 1000 documents:

   for $i in xdmp:directory($path, 'infinity')[1 to 1000]
   return xdmp:document-delete(xdmp:node-uri($i))

 

You could automate this using xdmp:spawn(). You could also use
cts:uris() with a cts:directory-query(), if you have the uri lexicon
available.
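For example, the lexicon-based variant of the batch above might look like
this (a sketch only; assumes the uri lexicon is enabled and $path is bound
to the directory URI):

   for $uri in cts:uris((), ('limit=1000'),
                 cts:directory-query($path, 'infinity'))
   return xdmp:document-delete($uri)

Because cts:uris() reads URIs straight from the lexicon rather than loading
document nodes, each batch should need far less memory than iterating over
xdmp:directory().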

 

-- Mike

 

On 2009-12-09 05:59, Lee, David wrote:

> My joys of success were premature.
> I ran into memory problems trying to load the full set of documents; it
> died after about 1mil.
> So I tried to delete the directory and now I'm getting
>
> Exception running: :query
> com.marklogic.xcc.exceptions.XQueryException: XDMP-MEMORY:
> xdmp:directory-delete("/RxNorm/rxnsat/") -- Memory exhausted
> in /eval, on line 1
>
> Arg !!!!
>
> I've tried to change various memory settings to no avail.  Any clue how to
> delete this directory?  Or should I start to delete the files piecemeal?
>
> Suggestions welcome.
>
> -David
>
> ----------------------------------------
> David A. Lee
> Senior Principal Software Engineer
> Epocrates, Inc.
> [email protected]<mailto:[email protected]>
> 812-482-5224
 

 

_______________________________________________
General mailing list
[email protected]
http://xqzone.com/mailman/listinfo/general
