You might find the MarkLogic extension module for xmlsh very useful for
this kind of process.
For example, the "list" command (like ls or find in Unix) lists all
documents in your database.
You can combine it with standard Unix (or Windows or Mac) programs for
easy command-line interaction with MarkLogic.

For example

import module ml=marklogic ;
ml:list  | grep test

would find your test.xml file.

Just a suggestion.  It's also good at preparing documents locally before
submitting them to MarkLogic.  I have found that is often easier than
uploading them to ML first and THEN splitting them up.  That way you never
end up with temporary data in the DB.
For example, you can split a big file into little ones and THEN put them:


xsplit test.xml
ml:put -r -baseuri /Docs/ x*.xml  
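
If xmlsh is not at hand, the local split step can be approximated in a few
lines of Python. This is only a rough sketch: it assumes the records are the
direct children of the root element of the big file, and the function name
split_children and the x<N>.xml file naming are made up for illustration. It
is not how xsplit itself is implemented, and it does not upload anything.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def split_children(big_file, out_dir, prefix="x"):
    """Write each direct child of the root element to its own XML file.

    Hypothetical helper for illustration; returns the list of paths written.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    root = ET.parse(str(big_file)).getroot()
    written = []
    for index, child in enumerate(root):
        # Drop inter-element whitespace so nothing trails the new root.
        child.tail = None
        target = out / f"{prefix}{index}.xml"
        # Each child becomes the root element of its own small document.
        ET.ElementTree(child).write(str(target), encoding="utf-8",
                                    xml_declaration=True)
        written.append(target)
    return written
```

Calling split_children("test.xml", ".") should leave x0.xml, x1.xml, and so
on in the current directory, so the big test.xml never has to be loaded into
the database at all.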

-David


http://www.xmlsh.org/ModuleMarkLogic


----------------------------------------
David A. Lee
Senior Principal Software Engineer
Epocrates, Inc.
[email protected]
812-482-5224






-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of helen chen
Sent: Thursday, May 06, 2010 2:15 PM
To: General Mark Logic Developer Discussion
Subject: Re: [MarkLogic Dev General] question on query

Hi Geert,

I found it; it was my fault.  The jps data was loaded from outside. I
first got the data from a database and made it into one big file, then I
loaded this big file into MarkLogic as test.xml, then I looped through the
sub nodes to separate each jps node into its own file at the URI I want.
But I forgot to delete this test.xml file, and all the values were coming
from this file. I deleted test.xml and now I get the correct value.

Thanks so much for your idea of using cts:search and base-uri; that's
how I found it. I'm merging now.

Helen

On May 6, 2010, at 1:52 PM, Geert Josten wrote:

> Hi Helen,
> 
> No sorry, my mistake, you did write that each article/jps was in a
> separate xml file.
> 
> Your code looks okay. There must be something subtle going on. Have
> you thought of taking the filter query, putting it in a cts:search, and
> asking for the base-uri of each result to see which results are
> returned that way?
> 
> I could also imagine that you are running as the admin user, which has
> the side effect that deleted docs are visible to you and could therefore
> pollute the results you get. Try merging the database to get rid of
> deleted fragments.
> 
> Kind regards,
> Geert
> 
>> -----Original Message-----
>> From: [email protected] 
>> [mailto:[email protected]] On Behalf Of helen chen
>> Sent: Thursday, 6 May 2010 19:37
>> To: General Mark Logic Developer Discussion
>> Subject: Re: [MarkLogic Dev General] question on query
>> 
>> Sorry I didn't describe it clearly.
>> 
>> Each <article> tag means one xml file, and each <jps> tag 
>> also means one xml file. 
>> The <article> tag holds the true article data; it is the top 
>> element in the xml file. The <jps> tag is some meta 
>> information that does not exist in the article data; it is at 
>> volume level, which means each coden and volume will have a 
>> separate xml file, and the jps tag is the top element in that xml 
>> file. The coden, volume and publication_year tags are also 
>> unique in each xml file.
>> 
>> We don't do anything special for fragments when loading data.  
>> But if they are in separate files, that should mean they are 
>> in different fragments. Correct me if that is not so.
>> 
>> If I do a query like the following, I get the correct value, 
>> but the big element-values query does not.
>> 
>> 
>> fn:doc()/nsjps:jps[(./nsjps:coden eq "AAA") and 
>> (./nsjps:volume eq "12")]/nsjps:publication_year/text()
>> 
>> Thanks, Helen
>> 
>> <article>...<coden>AAA</coden><volume>1</volume><issue>1</issue><paper>123</paper>....</article>
>> 
>> On May 6, 2010, at 12:55 PM, Geert Josten wrote:
>> 
>> 
>>      Hi Helen,
>>      
>>      Looking more closely at your problem now. You say you have one
>>      big file with all articles, which works correctly. You also have
>>      one big file with all jps elements. Could it be that you have
>>      added 'article' as fragment root? I think you also need 'jps' as
>>      fragment root.
>>      
>>      Your filter query will restrict the element-values to a set of
>>      fragments from which it 'takes' the appropriate values. But it
>>      will take all values within the same fragment. So if you don't
>>      fragment on 'jps', your filter query will always return the
>>      whole xml (with all jps elements) as one fragment, and hence
>>      element-values will always return all possible values.
>>      
>>      HTH
>>      
>>      Kind regards,
>>      Geert
>>      
>>      
>> 
>>              -----Original Message-----
>>              From: [email protected] 
>>              [mailto:[email protected]] On Behalf Of helen chen
>>              Sent: Thursday, 6 May 2010 18:46
>>              To: General Mark Logic Developer Discussion
>>              Subject: Re: [MarkLogic Dev General] question on query
>> 
>>              Hi Geert,
>> 
>>              Yes, the collation in the code is the collation I used to set
>>              the element-range indexes.
>> 
>>              I actually tried to do the query without specifying the
>>              collation like following, and I tried to take collation out
>>              one by one, but marklogic complains that there is no element
>>              range index for it, so I have to put collation there.
>> 
>>              cts:element-values(fn:QName("nsjps","publication_year"),(),
>>                  ("ascending"),
>>                  cts:and-query((
>>                      cts:element-range-query(fn:QName("nsjps","coden"), "=", "AAA",  ()),
>>                      cts:element-range-query(fn:QName("nsjps","volume"), "=", "12",  ())
>>                  ), ())
>>              )
>> 
>>              Helen
>>              
>> 
>> 
>>              On May 6, 2010, at 12:33 PM, Geert Josten wrote:
>> 
>>              Hi Helen,
>> 
>>              You are using a particular collation in your code. Do all
>>              indexes specify this same collation?
>> 
>>              Kind regards,
>>              Geert
>> 
>>              drs. G.P.H. (Geert) Josten
>>              Consultant
>> 
>>              Daidalos BV
>>              Hoekeindsehof 1-4
>>              2665 JZ Bleiswijk
>> 
>>              T +31 (0)10 850 1200
>>              F +31 (0)10 850 1199
>> 
>>              <mailto:[email protected]> [email protected]
>>              www.daidalos.nl <http://www.daidalos.nl/>
>> 
>>              KvK 27164984
>> 
>>              Please consider the environment before printing this mail.
>> 
>>              The information sent in or with this e-mail message originates
>>              from Daidalos BV and is intended solely for the addressee. If
>>              you have received this message unintentionally, we ask you to
>>              delete it. No rights can be derived from this message.
>> 
>>              ________________________________
>> 
>>              From: [email protected] 
>>              [mailto:[email protected]] On Behalf Of helen chen
>>              Sent: Thursday, 6 May 2010 18:13
>>              To: General Mark Logic Developer Discussion
>>              Cc: helen chen
>>              Subject: [MarkLogic Dev General] question on query
>> 
>>              I have a query that works on one structure but not on another
>>              set of structure, and I couldn't see the clue.
>> 
>>              For the working one: our data is in the following structure
>> 
>>              <article>...<coden></coden><volume></volume><issue></issue><paper></paper>....</article>
>> 
>>              The following are examples of the data; each <article> tag
>>              means one xml in marklogic:
>> 
>>              article 1:
>>              <article>...<coden>AAA</coden><volume>1</volume><issue>1</issue><paper>123</paper>....</article>
>>              article 2:
>>              <article>....<coden>AAA</coden><volume>1</volume><issue>2</issue><paper>233</paper>...</article>
>>              ...
>> 
>>              I want to get the issue list for the coden and volume, so I
>>              use the query
>> 
>>              cts:element-values(fn:QName("ns1","issue"),(),
>>                  ("collation=http://marklogic.com/collation/en/MO", "descending"),
>>                  cts:and-query((
>>                      cts:element-range-query(fn:QName("ns1","coden"), "=", "AAA",
>>                          "collation=http://marklogic.com/collation/en/MO"),
>>                      cts:element-range-query(fn:QName("ns1","volume"), "=", "1",
>>                          "collation=http://marklogic.com/collation/en/MO")
>>                  ))
>>              )
>> 
>>              This one works very well, and it gives me the correct issue
>>              list for different coden/volume.
>> 
>>              ------- for the not working one:
>>              I have another situation where I want to use a query similar
>>              to the one above.
>>              I have an xml file structure like:
>>              <jps xmlns="nsjps"> <coden></coden> <volume></volume> <publication_year></publication_year></jps>
>> 
>>              In the following, every line is one xml file
>>              ...
>>              <jps xmlns="nsjps"> <coden>AAA</coden> <volume>12</volume> <publication_year>1968</publication_year></jps>
>>              <jps xmlns="nsjps"> <coden>AAA</coden> <volume>11</volume> <publication_year>1967</publication_year></jps>
>>              <jps xmlns="nsjps"> <coden>AAA</coden> <volume>10</volume> <publication_year>1966</publication_year></jps>
>>              <jps xmlns="nsjps"> <coden>AAB</coden> <volume>15</volume> <publication_year>1978</publication_year></jps>
>>              ...
>> 
>>              Here every coden and volume should have a unique value as
>>              publication_year. I thought if I use the following query for
>>              coden=AAA and volume=12, I should get only 1968 as return, but
>>              I got all the values for all the publication_year for
>>              coden=AAA and volume=12, which is 1968, 1967, 1966...
>> 
>>              cts:element-values(fn:QName("nsjps","publication_year"),(),
>>                  ("collation=http://marklogic.com/collation/en/MO", "descending"),
>>                  cts:and-query((
>>                      cts:element-range-query(fn:QName("nsjps","coden"), "=", "AAA",
>>                          "collation=http://marklogic.com/collation/en/MO"),
>>                      cts:element-range-query(fn:QName("nsjps","volume"), "=", "12",
>>                          "collation=http://marklogic.com/collation/en/MO")
>>                  ))
>>              )
>> 
>>              Can anyone help me?
>> 
>>              Thanks, Helen
>> 
>>              _______________________________________________
>>              General mailing list
>>              [email protected]
>>              http://developer.marklogic.com/mailman/listinfo/general
>> 
>> 
>> 
>> 

_______________________________________________
General mailing list
[email protected]
http://developer.marklogic.com/mailman/listinfo/general
