RE: [MarkLogic Dev General] using XQuery on Word documents

Yves Dolce Tue, 11 Dec 2007 13:10:08 -0800

Right, Danny.
First I enabled the URI Lexicon and fired the query you suggested. It did not 
work but others did!
Thanks for the help!!
 
FWIW, the error I got is related to date format, I guess:
 
XDMP-COMPARE: <Date>2008-10-01T00:00:00</Date> gt xs:date("2008-01-01") --
Items not comparable: xs:date("2008-01-01") lt
xdt:untypedAtomic("2008-10-01T00:00:00")
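
The message suggests that the <Date> value is untyped and holds a dateTime string, so it cannot be compared with an xs:date using gt. A minimal sketch of one possible fix, assuming each customXml/item1.xml part holds at most one <Date> in xs:dateTime form, is to cast both sides to xs:dateTime explicitly. The variable name $uri is illustrative, and documents that lack the customXml/item1.xml part are not handled here:

```xquery
(: Return the URI of every .docx whose Customer/Date is after 2008-01-01. :)
for $uri in cts:uris()
where fn:ends-with($uri, ".docx")
  (: cast the untyped <Date> text and the literal to the same type :)
  and xs:dateTime(xdmp:zip-get(fn:doc($uri), "customXml/item1.xml")/Customer/Date)
      gt xs:dateTime("2008-01-01T00:00:00")
return $uri
```

If a part has no <Date> element, the cast yields the empty sequence and the comparison is simply not satisfied, so that document is skipped.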
> --------------------------------------------------
> From: "Yves Dolce (hotmail.com)" <[EMAIL PROTECTED]>
> Sent: Monday, December 10, 2007 4:31 PM
> To: "General Mark Logic Developer Discussion" <[email protected]>
> Subject: Re: [MarkLogic Dev General] using XQuery on Word documents
>
> > Thanks Danny. Just by looking at the syntax, I'm pretty sure this is what I
> > want. I'll try this tomorrow and will confirm. Thanks again.
> >
> > --------------------------------------------------
> > From: "Danny Sokolsky" <[EMAIL PROTECTED]>
> > Sent: Monday, December 10, 2007 4:06 PM
> > To: "General Mark Logic Developer Discussion" <[email protected]>
> > Subject: RE: [MarkLogic Dev General] using XQuery on Word documents
> >
> >> Hi Yves,
> >>
> >> To do this efficiently, it is very helpful to have a URI lexicon. A URI
> >> lexicon gives you very fast access to the URI of every document in the
> >> database. You enable the URI lexicon in the Admin Interface database
> >> config page for your database.
> >>
> >> Once you have the URI lexicon created (and reindexing has completed), you
> >> can do something like this to get what you want:
> >>
> >> for $x in cts:uris()
> >> where fn:ends-with($x, ".docx") and
> >> xdmp:zip-get(doc($x), "customXml/item1.xml")/Customer/Date
> >> gt xs:date("2008-01-01")
> >> return
> >> $x
> >>
> >> If you do it without the URI lexicon, you will probably need to do it in
> >> batches, because to get the URIs you need to first fetch the document and
> >> then do xdmp:node-uri to find its URI. This can effectively attempt to
> >> put the entire database in memory, and you therefore would probably need
> >> to do it in batches without the URI lexicon.
> >>
> >> If you have a lot of docx's in your database, you still probably want to
> >> do this in batches.
> >>
> >> Is this what you were looking for?
> >>
> >> -Danny
> >>
> >> From: [EMAIL PROTECTED]
> >> [mailto:[EMAIL PROTECTED] On Behalf Of Yves Dolce
> >> Sent: Monday, December 10, 2007 2:27 PM
> >> To: [email protected]
> >> Subject: [MarkLogic Dev General] using XQuery on Word documents
> >>
> >> This is a question that will have a simple answer. If only I knew more
> >> about XQuery...
> >>
> >> If I run the following line in CQ:
> >> xdmp:zip-get(doc("Contract.docx"), "customXml/item1.xml")
> >>
> >> I get:
> >> <Customer>
> >> <Date>2008-11-15T00:00:00</Date>
> >> <CompanyName>Bebop Corporation</CompanyName>
> >> <FirstName>Erick</FirstName>
> >> <LastName>Trojan</LastName>
> >> <SSN>1111-22-3333</SSN>
> >> <Address>Av. Revolucion 841, DF, CP 03910, Mexico</Address>
> >> <ContactTitle>Test Manager</ContactTitle>
> >> <Phone>+52 (55) 6666-66666</Phone>
> >> </Customer>
> >>
> >> How should I express a query that essentially says: for each docx file in
> >> the DB, get me its customXml/item1.xml part, if it has one, and the
> >> <Date> element in it is greater than 1/1/2008.
> >>
> >> Does my question make sense? Thanks!
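
The batching suggested in the quoted reply could be sketched roughly as follows. This is an illustration under stated assumptions, not a tested recipe: $start and $size are hypothetical values chosen for the example, the URI lexicon is assumed to be enabled, and the explicit xs:dateTime casts are an assumption about how the <Date> values should be compared.

```xquery
(: Process one batch of .docx URIs; rerun with $start advanced by $size. :)
let $start := 1      (: hypothetical: position of the first URI in this batch :)
let $size := 500     (: hypothetical: number of URIs per batch :)
let $docx-uris := cts:uris()[fn:ends-with(., ".docx")]
for $uri in fn:subsequence($docx-uris, $start, $size)
where xs:dateTime(xdmp:zip-get(fn:doc($uri), "customXml/item1.xml")/Customer/Date)
      gt xs:dateTime("2008-01-01T00:00:00")
return $uri
```

Only the documents in the current batch are fetched and unzipped, which keeps the number of documents read in any single request bounded by $size.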

_______________________________________________
General mailing list
[email protected]
http://xqzone.com/mailman/listinfo/general
