Hi Erik,

Is word count possible in MarkLogic? If so, what do I need to configure in
the Admin UI 8000? Can I enable a word lexicon and get the frequencies?
Say I have this XML document and 1000 similar XML files. I want to do a
word count on the description element.
<info>
  <company>ibm</company>
  <year>2001</year>
  <country>US</country>
  <description> This is an example description and such example benefits the
  description. </description>
</info>

So, for the above XML, I want to get something like:
example (2)
description (2)
This (1) ... etc.
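
To make the desired output concrete, here is a minimal client-side sketch in plain Java. It only illustrates the counting itself and does not use a MarkLogic word lexicon or any MarkLogic API. The class name DescriptionWordCount, the case-sensitive comparison, and the rule of splitting on anything that is not a letter or digit are assumptions made for the illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class DescriptionWordCount {
    // Counts case-sensitive word occurrences in first-seen order.
    // Words are separated by any run of characters that are not letters or digits.
    static Map<String, Integer> countWords(String text) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String word : text.split("[^\\p{L}\\p{N}]+")) {
            // split() can yield an empty first token when the text starts with a separator
            if (!word.isEmpty()) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Text of the description element from the sample document
        String description =
            " This is an example description and such example benefits the description. ";
        countWords(description)
            .forEach((word, n) -> System.out.println(word + " (" + n + ")"));
    }
}
```

Run over 1000 documents, the same map could simply be reused across all description texts, although that pulls every document to the client.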

Thanks



On Sat, Feb 21, 2015 at 9:13 PM, Maisnam Ns <[email protected]> wrote:

> Thanks Erik
>
> On Sat, Feb 21, 2015 at 8:32 PM, Erik Hennum <[email protected]>
> wrote:
>
>>  Hi, Maisnam:
>>
>>  Your query options should define a tuple in which the first column is
>> a range index on the country and the second column is a range index on year.
>>
>>
>> <options xmlns="http://marklogic.com/appservices/search">
>>     <tuples name="yearByCountry">
>>         <range type="xs:string" collation="http://marklogic.com/collation/">
>>             <element ns="" name="country"/>
>>         </range>
>>         <range type="xs:gYear">
>>             <element ns="" name="year"/>
>>         </range>
>>     </tuples>
>> </options>
>>
>>
>> After writing the "yearByCountry" query options to the server, you can then 
>> use the options to request tuples from the range indexes:
>>
>>
>> QueryManager queryMgr = dbClient.newQueryManager();
>> TuplesHandle results = queryMgr.tuples(
>>     queryMgr.newValuesDefinition("yearByCountry"), new TuplesHandle());
>> Tuple[] tuples = results.getTuples();
>>
>>
>>  You can then iterate over the tuples to get the frequency of
>> co-occurrence of each country and year.
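
As a rough illustration of that iteration step, the sketch below filters (country, year) rows down to one country and prints the per-year counts. It is self-contained and does not call the MarkLogic client: the Row class is a hypothetical stand-in for a returned tuple (in the real client the count would come from Tuple.getCount()), and the sample counts are the ones from the original question in this thread.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class YearCountsByCountry {
    // Hypothetical stand-in for one (country, year) tuple and its co-occurrence count.
    static final class Row {
        final String country;
        final String year;
        final long count;

        Row(String country, String year, long count) {
            this.country = country;
            this.year = year;
            this.count = count;
        }
    }

    // Keeps only the rows for one country and returns year -> count, sorted by year.
    static Map<String, Long> yearCounts(List<Row> rows, String country) {
        Map<String, Long> byYear = new TreeMap<>();
        for (Row row : rows) {
            if (row.country.equals(country)) {
                byYear.merge(row.year, row.count, Long::sum);
            }
        }
        return byYear;
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(
            new Row("US", "2001", 70),
            new Row("US", "2014", 13),
            new Row("JAPAN", "2000", 10));
        yearCounts(rows, "US")
            .forEach((year, n) -> System.out.println(year + " - (" + n + ")"));
    }
}
```

In this sketch a year that has no row for the chosen country is simply absent from the result rather than reported with a count of 0.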
>>
>>  For more information about defining tuples:
>>
>>  http://docs.marklogic.com/guide/rest-dev/appendixb#id_90089
>> http://docs.marklogic.com/guide/rest-dev/search#id_24433
>>
>>  For more information about making a tuple request:
>>
>>
>> http://docs.marklogic.com/javadoc/client/com/marklogic/client/query/QueryManager.html#newValuesDefinition(java.lang.String)
>>
>> http://docs.marklogic.com/javadoc/client/com/marklogic/client/query/QueryManager.html#tuples(com.marklogic.client.query.ValuesDefinition,%20T)
>>
>> http://docs.marklogic.com/javadoc/client/com/marklogic/client/query/Tuple.html#getCount()
>>
>>
>>  Hoping that helps,
>>
>>
>>  Erik Hennum
>>
>>    ------------------------------
>> *From:* [email protected] [
>> [email protected]] on behalf of Maisnam Ns [
>> [email protected]]
>> *Sent:* Saturday, February 21, 2015 12:02 AM
>> *To:* MarkLogic Developer Discussion
>> *Subject:* Re: [MarkLogic Dev General] JAVA API Query formation
>>
>>     Hi Erik,
>>
>>  Given this scenario:
>> Let's say this is file 1 and there are 1000 such different files
>> <info>
>>   <company>ibm</company>
>>   <year>2001</year>
>>   <country>US</country>
>> </info>
>>
>>  How do I get the counts per year for country = 'US' using the Java API?
>>
>> 2001 - (20)
>> 2002 - (5)
>> 2009 - (0)  etc.
>>
>>  Thanks
>>
>>
>>
>> On Sat, Feb 21, 2015 at 1:27 AM, Maisnam Ns <[email protected]> wrote:
>>
>>> Thanks Erik for your help. I will try to use XMLStreamWriter.
>>>
>>>  On Fri, Feb 20, 2015 at 11:09 PM, Erik Hennum <
>>> [email protected]> wrote:
>>>
>>>>   Hi, Maisnam:
>>>>
>>>> To get uncorrelated frequencies for three elements, you'll need to make
>>>> three separate requests, one for each element.
>>>>
>>>> Just so you're aware, you can also request tuples for the three
>>>> elements, but that request returns the frequencies for the
>>>> co-occurrence of values in a document and not the individual frequencies
>>>> for each element.
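
The difference between the two kinds of frequency can be sketched on a few in-memory documents. This is plain Java with no MarkLogic calls. The class name FrequencyKinds, the array layout (company, year, country), and the three sample documents (including the company name "acme") are invented for the illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class FrequencyKinds {
    // Individual frequency: how often each value of one element appears,
    // regardless of the other elements (0 = company, 1 = year, 2 = country).
    static Map<String, Integer> individual(List<String[]> docs, int field) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String[] doc : docs) {
            counts.merge(doc[field], 1, Integer::sum);
        }
        return counts;
    }

    // Co-occurrence frequency: how often each full combination of values
    // appears together in the same document.
    static Map<String, Integer> coOccurrence(List<String[]> docs) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String[] doc : docs) {
            counts.merge(String.join(" ", doc), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Invented sample documents: {company, year, country}
        List<String[]> docs = Arrays.asList(
            new String[] {"ibm", "2001", "US"},
            new String[] {"ibm", "2002", "US"},
            new String[] {"acme", "2001", "US"});

        // One pass per element, mirroring the three separate requests
        System.out.println("company: " + individual(docs, 0));
        System.out.println("year:    " + individual(docs, 1));
        System.out.println("country: " + individual(docs, 2));

        // A single pass over the combinations, mirroring one tuples request
        System.out.println("tuples:  " + coOccurrence(docs));
    }
}
```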
>>>>
>>>> By the way, the query options builder has been deprecated for several
>>>> releases and could go away in any future release.  You should instead use a
>>>> DOM (such as JDOM or XOM) or XMLStreamWriter to generate the options
>>>> XML.
>>>>
>>>>
>>>> Hoping that helps,
>>>>
>>>>
>>>>   Erik Hennum
>>>>
>>>>    ------------------------------
>>>> *From:* [email protected] [
>>>> [email protected]] on behalf of Maisnam Ns [
>>>> [email protected]]
>>>> *Sent:* Friday, February 20, 2015 2:40 AM
>>>> *To:* MarkLogic Developer Discussion
>>>> *Subject:* [MarkLogic Dev General] JAVA API Query formation
>>>>
>>>>     Hi,
>>>>
>>>>  Can someone help me form a Java API query for the sample below?
>>>>
>>>>  Let's say this is file 1 and there are 1000 such different files
>>>>  <info>
>>>>   <company>ibm</company>
>>>>   <year>2001</year>
>>>>   <country>US</country>
>>>> </info>
>>>>
>>>>  I just want to get the country, year and the count.
>>>>
>>>>  US 2001  70
>>>>  US 2014   13
>>>>  JAPAN 2000 10
>>>>
>>>>  Something like the above. I am able to get the count for only one
>>>> element, not two:
>>>>
>>>> QueryOptionsHandle options = new QueryOptionsHandle().withValues(
>>>>     qob.values("product",
>>>>         qob.range(
>>>>             qob.elementRangeIndex(new QName("country"),
>>>>                 qob.stringRangeType(QueryOptions.DEFAULT_COLLATION))),
>>>>         "frequency-order"));
>>>>  The above query gives me
>>>>
>>>>  US 190
>>>>  CH  123
>>>>  IND  70
>>>>
>>>>
>>>>  Thanks
>>>>
>>>>
>>>>
>>>>  _______________________________________________
>>>> General mailing list
>>>> [email protected]
>>>> http://developer.marklogic.com/mailman/listinfo/general
>>>>
>>>>
>>>
>>
>>
>>
>