Hi David,

Your suggestion looks really good to me. I'm at the point where I need to pull a lot of data from different databases and then reorganize it into MarkLogic. I was thinking about how to do that but hadn't found a good way. You've given me a good direction to start with. I'll look into it.

Thanks,
Helen

On May 6, 2010, at 3:04 PM, Lee, David wrote:

> You might find the MarkLogic extension module to xmlsh very useful for
> this kind of process.
> For example, the "list" command (like ls or find in Unix) will list all
> documents in your database.
> You can use that with standard Unix (or Windows or Mac) programs for
> easy command-line interaction with MarkLogic.
>
> For example
>
> import module ml=marklogic ;
> ml:list | grep test
>
> would find your test.xml file.
>
> Just a suggestion. It's also good at preparing documents locally to
> submit to MarkLogic. I found that is often easier than
> uploading them to ML first THEN splitting them up. That way you don't
> end up with temporary data in the DB ever.
> For example, you can split a big file into little ones THEN put them:
>
> xsplit test.xml
> ml:put -r -baseuri /Docs/ x*.xml
>
> -David
>
> http://www.xmlsh.org/ModuleMarkLogic
>
> ----------------------------------------
> David A. Lee
> Senior Principal Software Engineer
> Epocrates, Inc.
> [email protected]
> 812-482-5224
>
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of helen chen
> Sent: Thursday, May 06, 2010 2:15 PM
> To: General Mark Logic Developer Discussion
> Subject: Re: [MarkLogic Dev General] question on query
>
> Hi Geert,
>
> I found it, it was my fault. The jps data was loaded from outside. I
> first got the data from a database, made it into one big file, then loaded
> this big file into MarkLogic as test.xml, then looped through the sub
> nodes to separate each jps node into a separate file at the URI I want.
> But I forgot to delete this test.xml file, and all the values were coming
> from this file. I deleted test.xml and now I get the correct value.
>
> Thanks so much for your idea to use cts:search and base-uri, that's
> how I found it. And I'm merging now.
>
> Helen
>
> On May 6, 2010, at 1:52 PM, Geert Josten wrote:
>
>> Hi Helen,
>>
>> No, sorry, my mistake; you did write that each article/jps was in a
>> separate XML file.
>>
>> Your code looks okay. There must be something subtle going on. Have
>> you thought of taking the filter query, putting it in a cts:search, and
>> asking for the base-uri of each result to see which results are returned
>> that way?
>>
>> I could also imagine that you are running as the admin user, which has
>> the side effect that deleted docs are visible to you and could therefore
>> pollute the results you get. Try merging the database to get rid of
>> deleted fragments.
>>
>> Kind regards,
>> Geert
>>
>>> -----Original Message-----
>>> From: [email protected]
>>> [mailto:[email protected]] On Behalf Of
>>> helen chen
>>> Sent: Thursday, May 6, 2010 19:37
>>> To: General Mark Logic Developer Discussion
>>> Subject: Re: [MarkLogic Dev General] question on query
>>>
>>> Sorry, I didn't describe it clearly.
>>>
>>> Each <article> tag means one XML file, and each <jps> tag
>>> also means one XML file.
>>> The <article> tag holds the true article data; it is the top
>>> element in the XML file. The <jps> tag is some meta
>>> information that does not exist in the article data. It is at the
>>> volume level, which means each coden and volume will have a
>>> separate XML file, and the jps tag is the top element in that
>>> file. The coden, volume and publication_year tags are also
>>> unique in each XML file.
>>>
>>> We don't do anything special for fragments when loading data.
>>> But if they are in separate files, that should mean they are
>>> in different fragments. Correct me if that is wrong.
>>>
>>> If I do the query like the following, I get the correct value,
>>> but the big element-values query does not.
>>>
>>> fn:doc()/nsjps:jps[(./nsjps:coden eq "AAA") and
>>>   (./nsjps:volume eq "12")]/nsjps:publication_year/text()
>>>
>>> Thanks, Helen
>>>
>>> <article>...<coden>AAA</coden><volume>1</volume><issue>1</issue><paper>123</paper>....</article>
>>>
>>> On May 6, 2010, at 12:55 PM, Geert Josten wrote:
>>>
>>> Hi Helen,
>>>
>>> Looking more closely at your problem now. You say you
>>> have one big file with all articles, which works correctly.
>>> You also have one big file with all jps elements. Could it be
>>> that you have added 'article' as a fragment root? I think you
>>> also need 'jps' as a fragment root.
>>>
>>> Your filter query will restrict the element-values to a
>>> set of fragments from which it 'takes' the appropriate
>>> values. But it will take all values within the same fragment. So if
>>> you don't fragment on 'jps', your filter query will always
>>> return the whole XML (with all jps elements) as one fragment,
>>> and hence element-values will always return all possible values.
>>>
>>> HTH
>>>
>>> Kind regards,
>>> Geert
>>>
>>> >>> >>> -----Original Message-----
>>> >>> >>> From: [email protected]
>>> >>> >>> [mailto:[email protected]] On Behalf Of
>>> >>> >>> helen chen
>>> >>> >>> Sent: Thursday, May 6, 2010 18:46
>>> >>> >>> To: General Mark Logic Developer Discussion
>>> >>> >>> Subject: Re: [MarkLogic Dev General] question on query
>>> >>> >>>
>>> >>> >>> Hi Geert,
>>> >>> >>>
>>> >>> >>> Yes, the collation in the code is the collation I used to set
>>> >>> >>> the element-range indexes.
>>> >>> >>>
>>> >>> >>> I actually tried to do the query without specifying the
>>> >>> >>> collation, like the following, and I tried to take the collation out
>>> >>> >>> one by one, but MarkLogic complains that there is no element
>>> >>> >>> range index for it, so I have to put the collation there.
>>> >>> >>>
>>> >>> >>> cts:element-values(fn:QName("nsjps","publication_year"),(),
>>> >>> >>>   ("ascending"),
>>> >>> >>>   cts:and-query((
>>> >>> >>>     cts:element-range-query(fn:QName("nsjps","coden"), "=", "AAA", ()),
>>> >>> >>>     cts:element-range-query(fn:QName("nsjps","volume"), "=", "12", ())
>>> >>> >>>   ), ())
>>> >>> >>> )
>>> >>> >>>
>>> >>> >>> Helen
>>> >>> >>>
>>> >>> >>> On May 6, 2010, at 12:33 PM, Geert Josten wrote:
>>> >>> >>>
>>> >>> >>> Hi Helen,
>>> >>> >>>
>>> >>> >>> You are using a particular collation in your code. Do
>>> >>> >>> all indexes specify this same collation?
>>> >>> >>>
>>> >>> >>> Kind regards,
>>> >>> >>> Geert
>>> >>> >>>
>>> >>> >>> drs. G.P.H. (Geert) Josten
>>> >>> >>> Consultant
>>> >>> >>>
>>> >>> >>> Daidalos BV
>>> >>> >>> Hoekeindsehof 1-4
>>> >>> >>> 2665 JZ Bleiswijk
>>> >>> >>>
>>> >>> >>> T +31 (0)10 850 1200
>>> >>> >>> F +31 (0)10 850 1199
>>> >>> >>>
>>> >>> >>> [email protected]
>>> >>> >>> www.daidalos.nl <http://www.daidalos.nl/>
>>> >>> >>>
>>> >>> >>> KvK 27164984
>>> >>> >>>
>>> >>> >>> Please consider the environment before printing this mail.
>>> >>> >>> The information sent in or with this e-mail message
>>> >>> >>> originates from Daidalos BV and is intended solely for
>>> >>> >>> the addressee. If you have received this message unintentionally,
>>> >>> >>> we ask that you delete it. No rights can be derived from this
>>> >>> >>> message.
>>> >>> >>>
>>> >>> >>> ________________________________
>>> >>> >>>
>>> >>> >>> From: [email protected]
>>> >>> >>> [mailto:[email protected]] On Behalf Of
>>> >>> >>> helen chen
>>> >>> >>> Sent: Thursday, May 6, 2010 18:13
>>> >>> >>> To: General Mark Logic Developer Discussion
>>> >>> >>> Cc: helen chen
>>> >>> >>> Subject: [MarkLogic Dev General] question on query
>>> >>> >>>
>>> >>> >>> I have a query that works on one structure but not
>>> >>> >>> on another, and I can't see why.
>>> >>> >>>
>>> >>> >>> For the working one, our data has the
>>> >>> >>> following structure:
>>> >>> >>>
>>> >>> >>> <article>...<coden></coden><volume></volume><issue></issue><paper></paper>....</article>
>>> >>> >>>
>>> >>> >>> The following are examples of the data; each
>>> >>> >>> <article> tag means one XML file in MarkLogic:
>>> >>> >>>
>>> >>> >>> article 1:
>>> >>> >>> <article>...<coden>AAA</coden><volume>1</volume><issue>1</issue><paper>123</paper>....</article>
>>> >>> >>> article 2:
>>> >>> >>> <article>....<coden>AAA</coden><volume>1</volume><issue>2</issue><paper>233</paper>...</article>
>>> >>> >>> ...
>>> >>> >>>
>>> >>> >>> I want to get the issue list for the coden and
>>> >>> >>> volume, so I use the query
>>> >>> >>>
>>> >>> >>> cts:element-values(fn:QName("ns1","issue"),(),
>>> >>> >>>   ("collation=http://marklogic.com/collation/en/MO", "descending"),
>>> >>> >>>   cts:and-query((
>>> >>> >>>     cts:element-range-query(fn:QName("ns1","coden"), "=", "AAA",
>>> >>> >>>       "collation=http://marklogic.com/collation/en/MO"),
>>> >>> >>>     cts:element-range-query(fn:QName("ns1","volume"), "=", "1",
>>> >>> >>>       "collation=http://marklogic.com/collation/en/MO")
>>> >>> >>>   ))
>>> >>> >>> )
>>> >>> >>>
>>> >>> >>> This one works very well, and it gives me the
>>> >>> >>> correct issue list for different coden/volume.
>>> >>> >>>
>>> >>> >>> ------- For the one that does not work:
>>> >>> >>> I have another situation where I want to use a
>>> >>> >>> query similar to the one above.
>>> >>> >>> I have an XML file structure like: <jps
>>> >>> >>> xmlns="nsjps"> <coden></coden> <volume></volume>
>>> >>> >>> <publication_year></publication_year></jps>
>>> >>> >>>
>>> >>> >>> In the following, every line is one XML file:
>>> >>> >>> ...
>>> >>> >>> <jps xmlns="nsjps"> <coden>AAA</coden> <volume>12</volume> <publication_year>1968</publication_year></jps>
>>> >>> >>> <jps xmlns="nsjps"> <coden>AAA</coden> <volume>11</volume> <publication_year>1967</publication_year></jps>
>>> >>> >>> <jps xmlns="nsjps"> <coden>AAA</coden> <volume>10</volume> <publication_year>1966</publication_year></jps>
>>> >>> >>> <jps xmlns="nsjps"> <coden>AAB</coden> <volume>15</volume> <publication_year>1978</publication_year></jps>
>>> >>> >>> ...
>>> >>> >>>
>>> >>> >>> Here every coden and volume should have a unique
>>> >>> >>> publication_year value.
>>> >>> >>> I thought that if I use the following
>>> >>> >>> query for coden=AAA and volume=12, I should get only 1968
>>> >>> >>> back, but I got all the publication_year values
>>> >>> >>> for coden=AAA and volume=12, which are 1968, 1967, 1966...
>>> >>> >>>
>>> >>> >>> cts:element-values(fn:QName("nsjps","publication_year"),(),
>>> >>> >>>   ("collation=http://marklogic.com/collation/en/MO", "descending"),
>>> >>> >>>   cts:and-query((
>>> >>> >>>     cts:element-range-query(fn:QName("nsjps","coden"), "=", "AAA",
>>> >>> >>>       "collation=http://marklogic.com/collation/en/MO"),
>>> >>> >>>     cts:element-range-query(fn:QName("nsjps","volume"), "=", "12",
>>> >>> >>>       "collation=http://marklogic.com/collation/en/MO")
>>> >>> >>>   ))
>>> >>> >>> )
>>> >>> >>>
>>> >>> >>> Can anyone help me?
>>> >>> >>>
>>> >>> >>> Thanks, Helen
>>> >>> >>>
>>> >>> >>> _______________________________________________
>>> >>> >>> General mailing list
>>> >>> >>> [email protected]
>>> >>> >>> http://developer.marklogic.com/mailman/listinfo/general
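
The diagnostic Geert suggested in the thread (run the same filter query through cts:search and ask for the base-uri of each result) could look roughly like the sketch below. This is not the query Helen actually ran, which the thread does not show. It assumes the "nsjps" namespace, the element range indexes on coden and volume, and the collation from the original question; the variable names are made up for illustration.

```xquery
xquery version "1.0-ml";

(: Sketch only: list the URI of every document matched by the filter query.
   Assumes element range indexes on coden and volume in the "nsjps" namespace,
   configured with the collation below (taken from the original question). :)
let $collation := "collation=http://marklogic.com/collation/en/MO"
let $filter :=
  cts:and-query((
    cts:element-range-query(fn:QName("nsjps", "coden"), "=", "AAA", $collation),
    cts:element-range-query(fn:QName("nsjps", "volume"), "=", "12", $collation)
  ))
for $result in cts:search(fn:doc(), $filter)
return fn:base-uri($result)
```

If a URI other than the expected per-volume file shows up in the output, that document is also feeding values into cts:element-values. In this thread the extra document turned out to be the leftover test.xml bulk file.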
