Re: [MarkLogic Dev General] Fwd: [1.0-ml] XDMP-EXPNTREECACHEFULL

VISH RAJPUT Mon, 26 Mar 2012 05:29:04 -0700

Thanks Geert,

Is there any alternative way to find the unique element names within a
database?


Warm Regards,
Vishnu



On Mon, Mar 26, 2012 at 5:55 PM, Geert Josten <[email protected]> wrote:

> Hi Vishnu,
>
>
>
> 90 MB isn’t much indeed, but MarkLogic is configured to keep a low memory
> footprint, even if there are 30 concurrent requests. To ensure that, the
> tree size limit (look at the database setting in the admin interface) is
> usually pretty low. I have 8 GB of memory and still it is set to no more
> than 85 MB by default. But you can increase it if you like.
>
>
>
> A more streaming approach, which my advice attempts to achieve to some
> extent, helps keep the footprint low and keeps MarkLogic fast.
>
>
>
> Kind regards,
>
> Geert
>
>
>
> *From:* [email protected] [mailto:
> [email protected]] *On behalf of* VISH RAJPUT
> *Sent:* Monday, 26 March 2012 14:17
> *To:* MarkLogic Developer Discussion
> *Subject:* Re: [MarkLogic Dev General] Fwd: [1.0-ml]
> XDMP-EXPNTREECACHEFULL
>
>
>
> Thanks Geert,
>
>
>
> But it still shows *XDMP-EXPNTREECACHEFULL:
> distinct-values(collection("ContentAnalysis")//*/local-name()) --
> Expanded tree cache full on host....* The overall database size is only
> 90 MB; I don't think that is a huge amount of data for MarkLogic.
>
>
>
>
>
> Regards,
>
> Vishnu
>
>
>
> On Mon, Mar 26, 2012 at 1:25 PM, Geert Josten <[email protected]>
> wrote:
>
> Hi Vishnu,
>
>
>
> Your FLWOR expression won’t return distinct names, since you are applying
> the function to each individual name. You should write:
>
>
>
> distinct-values(
>     for $a in //*
>     return local-name($a)
> )
>
>
>
> Or better:
>
>
>
> distinct-values(collection()//*/local-name())
>
>
>
> But this still might not perform well, or might still max out the list or
> tree caches. This approach creates a complete list of all element names
> first, and applies distinct-values only afterwards. You might consider
> taking multiple steps: per doc first, then per cluster of 100 files, and
> only then across all clusters. You could also just take 100 random
> samples, and use those. That doesn’t guarantee a 100% complete list, but it
> remains performant even if your database grows 10 or 100 fold.
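>
> A minimal sketch of the first step (per doc first), assuming the
> "ContentAnalysis" collection named elsewhere in this thread. It has not
> been run against your data, and whether it stays within the tree cache
> depends on how the server evaluates it:
>
> ```xquery
> xquery version "1.0-ml";
>
> (: Deduplicate names within each document first, so only a short
>    list of strings per document is carried forward; the outer
>    distinct-values then merges those per-document lists. :)
> distinct-values(
>     for $doc in collection("ContentAnalysis")
>     return distinct-values($doc//*/local-name())
> )
> ```
>
> The clustering per 100 files would add one more level of the same
> pattern between the inner and the outer distinct-values.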
>
>
>
> Kind regards,
>
> Geert
>
>
>
> *From:* [email protected] [mailto:
> [email protected]] *On behalf of* VISH RAJPUT
> *Sent:* Monday, 26 March 2012 8:29
> *To:* [email protected]
> *Subject:* [MarkLogic Dev General] Fwd: [1.0-ml] XDMP-EXPNTREECACHEFULL
>
>
>
> The total size of all the files is approximately 90 MB.
>
> ---------- Forwarded message ----------
> From: *VISH RAJPUT* <[email protected]>
> Date: Mon, Mar 26, 2012 at 11:56 AM
> Subject: [1.0-ml] XDMP-EXPNTREECACHEFULL
> To: [email protected]
>
>
> Hi,
>
>
>
> I have 2000 files in a MarkLogic database within a single forest, and I
> want to find the unique element names across all 2000 files in this
> database. For this I wrote the query below:
>
>
>
> for $a in //*
>
> return distinct-values($a/local-name())
>
>
>
> but with this I got the error "*[1.0-ml] XDMP-EXPNTREECACHEFULL*". What
> should I do?
>
>
>
>
>
> Regards,
>
> Vishnu Singh
>
>
>
>
> _______________________________________________
> General mailing list
> [email protected]
> http://developer.marklogic.com/mailman/listinfo/general
>