Thanks Erik, the Java API for installing and executing server-side XQuery was a real eye-opener for me; I didn't even know it existed. It really helped me a lot.
Thanks again; your answers solved many of my use cases.

On Sat, Feb 21, 2015 at 11:37 PM, Erik Hennum <[email protected]> wrote:

> Hi, Maisnam:
>
> Value and tuple requests are supported over URI or collection lexicons,
> range indexes, field range indexes, and geospatial indexes.
>
> Given that the document set is quite small, you could create a string
> range index over the description, execute a value request on the
> description, split the description into words on the client side, and
> calculate the per-document word counts on the client side.
>
> For larger document sets, you can calculate the number of documents with
> each word.
>
> Using the cts:element-words() function in server-side XQuery (or, in
> MarkLogic 8, using the cts.elementWords() function in server-side
> JavaScript), you can use an element word lexicon to get a list of words
> used in an element. The documentation shows a technique that can be
> adapted to count the number of documents with each word:
>
> http://docs.marklogic.com/guide/search-dev/lexicon#id_95439
>
> The Java API provides an interface for installing and executing
> server-side XQuery or JavaScript:
>
> http://docs.marklogic.com/guide/java/resourceservices#id_27702
>
> Hoping that helps,
>
> Erik Hennum
>
> ------------------------------
> *From:* [email protected] [[email protected]] on behalf of Maisnam Ns [[email protected]]
> *Sent:* Saturday, February 21, 2015 9:19 AM
> *To:* MarkLogic Developer Discussion
> *Subject:* Re: [MarkLogic Dev General] JAVA API Query formation
>
> Hi Erik,
>
> Is word count possible in MarkLogic? What should I configure in the Admin UI
> 8000, if this is possible? Can I enable a word lexicon and get the
> frequencies?
> Say I have this XML, and 1000 such XML files. Now I want to do a word count
> on the description element.
> <info>
>   <company>ibm</company>
>   <year>2001</year>
>   <country>US</country>
>   <description> This is an example description and such example benefits
>   the description. </description>
> </info>
>
> So, in the above XML, I want to get something like
> example (2)
> description (2)
> This (1) .....etc
>
> Thanks
>
> On Sat, Feb 21, 2015 at 9:13 PM, Maisnam Ns <[email protected]> wrote:
>
>> Thanks Erik
>>
>> On Sat, Feb 21, 2015 at 8:32 PM, Erik Hennum <[email protected]>
>> wrote:
>>
>>> Hi, Maisnam:
>>>
>>> Your query options should define a tuple in which the first column is
>>> a range index on the country and the second column is a range index on year.
>>>
>>> <options xmlns="http://marklogic.com/appservices/search">
>>>   <tuples name="yearByCountry">
>>>     <range type="xs:string" collation="http://marklogic.com/collation/">
>>>       <element ns="" name="country"/>
>>>     </range>
>>>     <range type="xs:gYear">
>>>       <element ns="" name="year"/>
>>>     </range>
>>>   </tuples>
>>> </options>
>>>
>>> After writing the "yearByCountry" query options to the server, you can then
>>> use the options to request tuples from the range indexes:
>>>
>>> QueryManager queryMgr = dbClient.newQueryManager();
>>> TuplesHandle results = queryMgr.tuples(
>>>     queryMgr.newValuesDefinition("yearByCountry"), new TuplesHandle());
>>> Tuple[] tuples = results.getTuples();
>>>
>>> You can then iterate over the tuples to get the counts on the
>>> frequency of co-occurrence of each country and year.
>>>
>>> For more information about defining tuples:
>>>
>>> http://docs.marklogic.com/guide/rest-dev/appendixb#id_90089
>>> http://docs.marklogic.com/guide/rest-dev/search#id_24433
>>>
>>> For more information about making a tuple request:
>>>
>>> http://docs.marklogic.com/javadoc/client/com/marklogic/client/query/QueryManager.html#newValuesDefinition(java.lang.String)
>>>
>>> http://docs.marklogic.com/javadoc/client/com/marklogic/client/query/QueryManager.html#tuples(com.marklogic.client.query.ValuesDefinition,%20T)
>>>
>>> http://docs.marklogic.com/javadoc/client/com/marklogic/client/query/Tuple.html#getCount()
>>>
>>> Hoping that helps,
>>>
>>> Erik Hennum
>>>
>>> ------------------------------
>>> *From:* [email protected] [[email protected]] on behalf of Maisnam Ns [[email protected]]
>>> *Sent:* Saturday, February 21, 2015 12:02 AM
>>> *To:* MarkLogic Developer Discussion
>>> *Subject:* Re: [MarkLogic Dev General] JAVA API Query formation
>>>
>>> Hi Erik,
>>>
>>> Given this scenario:
>>> Let's say this is file 1 and there are 1000 such different files
>>> <info>
>>>   <company>ibm</company>
>>>   <year>2001</year>
>>>   <country>US</country>
>>> </info>
>>>
>>> How do I get the count of years for country = 'US' using the Java API?
>>>
>>> 2001 - (20)
>>> 2002 - (5)
>>> 2009 - (0) etc
>>>
>>> Thanks
>>>
>>> On Sat, Feb 21, 2015 at 1:27 AM, Maisnam Ns <[email protected]>
>>> wrote:
>>>
>>>> Thanks Erik for your help. Will try to use XMLStreamWriter.
>>>>
>>>> On Fri, Feb 20, 2015 at 11:09 PM, Erik Hennum <[email protected]> wrote:
>>>>
>>>>> Hi, Maisnam:
>>>>>
>>>>> To get uncorrelated frequencies for three elements, you'll need to
>>>>> make three separate requests, one for each element.
>>>>>
>>>>> Just so you're aware, you can also request tuples for the three
>>>>> elements, but that request returns the frequencies for the
>>>>> co-occurrence of values in a document and not the individual frequencies
>>>>> for each element.
>>>>>
>>>>> By the way, the query options builder has been deprecated for several
>>>>> releases and could go away in any future release. You should instead use
>>>>> a DOM (such as JDOM or XOM) or XMLStreamWriter to generate the options
>>>>> XML.
>>>>>
>>>>> Hoping that helps,
>>>>>
>>>>> Erik Hennum
>>>>>
>>>>> ------------------------------
>>>>> *From:* [email protected] [[email protected]] on behalf of Maisnam Ns [[email protected]]
>>>>> *Sent:* Friday, February 20, 2015 2:40 AM
>>>>> *To:* MarkLogic Developer Discussion
>>>>> *Subject:* [MarkLogic Dev General] JAVA API Query formation
>>>>>
>>>>> Hi ,
>>>>>
>>>>> Can someone help me with the JAVA API query formation for the below
>>>>> sample
>>>>>
>>>>> Let's say this is file 1 and there are 1000 such different files
>>>>> <info>
>>>>>   <company>ibm</company>
>>>>>   <year>2001</year>
>>>>>   <country>US</country>
>>>>> </info>
>>>>>
>>>>> I just want to get the country, year and the count.
>>>>>
>>>>> US 2001 70
>>>>> US 2014 13
>>>>> JAPAN 2000 10
>>>>>
>>>>> Something like the above, I am able to get the count of only one
>>>>> element not two
>>>>>
>>>>> QueryOptionsHandle options = new QueryOptionsHandle().withValues(
>>>>>     qob.values("product",
>>>>>         qob.range(
>>>>>             qob.elementRangeIndex(new QName("country"),
>>>>>                 qob.stringRangeType(QueryOptions.DEFAULT_COLLATION))),
>>>>>         "frequency-order"));
>>>>> The above query gives me
>>>>>
>>>>> US 190
>>>>> CH 123
>>>>> IND 70
>>>>>
>>>>> Thanks
>>>>>
>>>>> _______________________________________________
>>>>> General mailing list
>>>>> [email protected]
>>>>> http://developer.marklogic.com/mailman/listinfo/general
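
For reference, here is a minimal sketch of the XMLStreamWriter approach Erik recommends in place of the deprecated query options builder. It serializes the "yearByCountry" tuples options quoted earlier in the thread using only the JDK's javax.xml.stream classes. The class name TupleOptionsWriter and its helper methods are made up for illustration and are not part of the MarkLogic Java API; writing the resulting XML to the server is not shown.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical helper (not a MarkLogic class): builds the options XML as a string.
final class TupleOptionsWriter {
    private static final String SEARCH_NS = "http://marklogic.com/appservices/search";

    /** Serializes the "yearByCountry" tuples definition (country, then year). */
    static String buildOptions() {
        StringWriter out = new StringWriter();
        try {
            XMLStreamWriter xml = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            xml.writeStartElement("options");
            xml.writeDefaultNamespace(SEARCH_NS);
            xml.writeStartElement("tuples");
            xml.writeAttribute("name", "yearByCountry");
            // First column: string range index on <country>.
            writeRange(xml, "xs:string", "http://marklogic.com/collation/", "country");
            // Second column: gYear range index on <year>; no collation for non-string types.
            writeRange(xml, "xs:gYear", null, "year");
            xml.writeEndElement(); // tuples
            xml.writeEndElement(); // options
            xml.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("could not serialize query options", e);
        }
        return out.toString();
    }

    /** Writes one <range> column that refers to an element in no namespace. */
    private static void writeRange(XMLStreamWriter xml, String type, String collation,
                                   String elementName) throws XMLStreamException {
        xml.writeStartElement("range");
        xml.writeAttribute("type", type);
        if (collation != null) {
            xml.writeAttribute("collation", collation);
        }
        xml.writeEmptyElement("element");
        xml.writeAttribute("ns", "");
        xml.writeAttribute("name", elementName);
        xml.writeEndElement(); // range
    }

    public static void main(String[] args) {
        System.out.println(buildOptions());
    }
}
```

The string returned by buildOptions() can then be handed to whatever mechanism you use to store query options on the server.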
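
The client-side word count Erik suggests for a small document set could look roughly like the sketch below. It assumes the description values have already been fetched (for example through a value request on a string range index) and are available as plain Java strings; the fetching step is not shown. The class DescriptionWordCount and the method countWords are hypothetical names, and splitting on anything that is not a letter or digit is just one simple tokenization choice, not what the server's word lexicon does.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not a MarkLogic class): counts words on the client side.
final class DescriptionWordCount {

    /**
     * Counts how often each word occurs across the given description values.
     * Matching is case-sensitive, so "This" and "this" are counted separately.
     */
    static Map<String, Integer> countWords(List<String> descriptions) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String description : descriptions) {
            // Split on runs of characters that are neither letters nor digits.
            for (String word : description.trim().split("[^\\p{L}\\p{N}]+")) {
                if (!word.isEmpty()) {
                    counts.merge(word, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Sample description taken from the question in this thread.
        List<String> descriptions = Arrays.asList(
            " This is an example description and such example benefits the description. ");
        countWords(descriptions).forEach(
            (word, count) -> System.out.println(word + " (" + count + ")"));
    }
}
```

To get per-document counts instead of totals, call countWords once per description rather than passing the whole list.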
