Hi, Maisnam:
Value and tuple requests are supported over URI or collection lexicons, range
indexes, field range indexes, and geospatial indexes.
Given that the document set is quite small, you could create a string range
index over the description, execute a value request on the description, split
the description into words on the client side, and calculate the per-document
word counts on the client side.
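The client-side counting step could look roughly like the sketch below. It assumes the description text has already been retrieved by the value request; the class and method names are made up for illustration, and lowercasing words and splitting on anything that is not a letter or digit are my own assumptions, not MarkLogic behavior.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper: counts words in one description string on the client.
final class WordCountSketch {

    // Splits on runs of characters that are not letters or digits and
    // counts each lowercased word, keeping first-seen order.
    static Map<String, Integer> countWords(String description) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String word : description.split("[^\\p{L}\\p{N}]+")) {
            if (word.isEmpty()) {
                continue; // a leading separator yields one empty token
            }
            counts.merge(word.toLowerCase(Locale.ROOT), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        String description =
            " This is an example description and such example benefits the description. ";
        // Prints one "word (count)" line per distinct word.
        countWords(description)
            .forEach((word, count) -> System.out.println(word + " (" + count + ")"));
    }
}
```

Calling countWords once per returned description gives the per-document counts; drop the toLowerCase call if "This" and "this" should be counted separately.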
For larger document sets, you can calculate the number of documents with each
word.
Using the cts:element-words() function in server-side XQuery (or, in MarkLogic
8, using the cts.elementWords() function in server-side JavaScript), you can
use an element word lexicon to get a list of words used in an element. The
documentation shows a technique that can be adapted to count the number of
documents with each word:
http://docs.marklogic.com/guide/search-dev/lexicon#id_95439
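Note that "number of documents with each word" is a different quantity from the number of occurrences of a word inside one document. The plain-Java sketch below only models that quantity on a small in-memory list so the difference is visible; it is not the lexicon call itself, and the class name, the sample strings, and the tokenizing rule are assumptions for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical model of a per-word document count (not a MarkLogic API).
final class DocumentFrequencySketch {

    // For each word, counts how many descriptions contain it at least once.
    static Map<String, Integer> documentFrequencies(List<String> descriptions) {
        Map<String, Integer> docCounts = new TreeMap<>();
        for (String description : descriptions) {
            Set<String> seenInThisDoc = new HashSet<>();
            String lowered = description.toLowerCase(Locale.ROOT);
            for (String word : lowered.split("[^\\p{L}\\p{N}]+")) {
                // Set.add returns false for repeats, so each document
                // contributes at most 1 to a given word.
                if (!word.isEmpty() && seenInThisDoc.add(word)) {
                    docCounts.merge(word, 1, Integer::sum);
                }
            }
        }
        return docCounts;
    }

    public static void main(String[] args) {
        List<String> descriptions = Arrays.asList(
            "This is an example description and such example benefits the description.",
            "Another description without that word.");
        System.out.println(documentFrequencies(descriptions));
    }
}
```

Here "description" is counted for two documents even though it occurs three times in total, which is the kind of figure a per-word document count yields.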
The Java API provides an interface for installing and executing server-side
XQuery or JavaScript:
http://docs.marklogic.com/guide/java/resourceservices#id_27702
Hoping that helps,
Erik Hennum
________________________________
From: [email protected]
[[email protected]] on behalf of Maisnam Ns
[[email protected]]
Sent: Saturday, February 21, 2015 9:19 AM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] JAVA API Query formation
Hi Erik,
Is word count possible in MarkLogic? If so, what do I need to configure in the
Admin UI 8000? Can I enable a word lexicon and get the frequencies?
Say I have 1000 XML files like the one below, and I want to do a word count on
the description element.
<info>
<company>ibm</company>
<year>2001</year>
<country>US</country>
<description> This is an example description and such example benefits the
description. </description>
</info>
So, for the above XML, I want to get something like:
example (2)
description (2)
This (1) .....etc
Thanks
On Sat, Feb 21, 2015 at 9:13 PM, Maisnam Ns <[email protected]> wrote:
Thanks Erik
On Sat, Feb 21, 2015 at 8:32 PM, Erik Hennum <[email protected]> wrote:
Hi, Maisnam:
Your query options should define a tuple in which the first column is a range
index on the country and the second column is a range index on the year.
<options xmlns="http://marklogic.com/appservices/search">
  <tuples name="yearByCountry">
    <range type="xs:string" collation="http://marklogic.com/collation/">
      <element ns="" name="country"/>
    </range>
    <range type="xs:gYear">
      <element ns="" name="year"/>
    </range>
  </tuples>
</options>
After writing the "yearByCountry" query options to the server, you can then use
the options to request tuples from the range indexes:
QueryManager queryMgr = dbClient.newQueryManager();
TuplesHandle results = queryMgr.tuples(
    queryMgr.newValuesDefinition("yearByCountry"), new TuplesHandle());
Tuple[] tuples = results.getTuples();
You can then iterate over the tuples to get the counts on the frequency of
co-occurrence of each country and year.
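To get from the co-occurrence counts to "years for country = US", the rows only need to be filtered on the first column. The sketch below models that step in plain Java; Row is a hypothetical stand-in for one returned tuple (two column values plus its count), not the client API's Tuple class, and the figures are the sample numbers from the original question.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of filtering (country, year, count) rows by country.
final class YearCountsSketch {

    // Stand-in for one tuple: first column, second column, co-occurrence count.
    static final class Row {
        final String country;
        final String year;
        final long count;

        Row(String country, String year, long count) {
            this.country = country;
            this.year = year;
            this.count = count;
        }
    }

    // Keeps only the rows for the given country and maps year -> count.
    static Map<String, Long> yearCountsFor(String country, List<Row> rows) {
        Map<String, Long> byYear = new LinkedHashMap<>();
        for (Row row : rows) {
            if (row.country.equals(country)) {
                byYear.merge(row.year, row.count, Long::sum);
            }
        }
        return byYear;
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(
            new Row("US", "2001", 70),
            new Row("US", "2014", 13),
            new Row("JAPAN", "2000", 10));
        yearCountsFor("US", rows)
            .forEach((year, count) -> System.out.println(year + " - (" + count + ")"));
    }
}
```

In this model a year only appears if some row pairs it with the country, so a line such as "2009 - (0)" would have to be added by the caller for years it expects but does not find.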
For more information about defining tuples:
http://docs.marklogic.com/guide/rest-dev/appendixb#id_90089
http://docs.marklogic.com/guide/rest-dev/search#id_24433
For more information about making a tuple request:
http://docs.marklogic.com/javadoc/client/com/marklogic/client/query/QueryManager.html#newValuesDefinition(java.lang.String)
http://docs.marklogic.com/javadoc/client/com/marklogic/client/query/QueryManager.html#tuples(com.marklogic.client.query.ValuesDefinition,%20T)
http://docs.marklogic.com/javadoc/client/com/marklogic/client/query/Tuple.html#getCount()
Hoping that helps,
Erik Hennum
________________________________
From: [email protected]
[[email protected]] on behalf of Maisnam Ns
[[email protected]]
Sent: Saturday, February 21, 2015 12:02 AM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] JAVA API Query formation
Hi Erik,
Given this scenario:
Let's say this is file 1 and there are 1000 such different files
<info>
<company>ibm</company>
<year>2001</year>
<country>US</country>
</info>
How do I get the count of documents per year for country = 'US' using the Java API?
2001 - (20)
2002 - (5)
2009 - (0) etc.
Thanks
On Sat, Feb 21, 2015 at 1:27 AM, Maisnam Ns <[email protected]> wrote:
Thanks, Erik, for your help. I will try to use XMLStreamWriter.
On Fri, Feb 20, 2015 at 11:09 PM, Erik Hennum <[email protected]> wrote:
Hi, Maisnam:
To get uncorrelated frequencies for three elements, you'll need to make three
separate requests, one for each element.
Just so you're aware, you can also request tuples for the three elements, but
that request returns the frequencies for the co-occurrence of values in a
document and not the individual frequencies for each element.
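The difference between the two kinds of request can be modeled in a few lines of plain Java. This is only a sketch of the arithmetic over made-up in-memory documents (the class name, the sample rows, and the "sony" value are illustrative assumptions); it does not call the server.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical model: per-element frequencies versus co-occurrence frequencies.
final class FrequencySketch {

    // Uncorrelated frequency: how many documents have each value in one column.
    static Map<String, Integer> valueFrequencies(List<String[]> docs, int column) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String[] doc : docs) {
            counts.merge(doc[column], 1, Integer::sum);
        }
        return counts;
    }

    // Co-occurrence frequency: how many documents have each pair of values.
    static Map<String, Integer> pairFrequencies(List<String[]> docs, int first, int second) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String[] doc : docs) {
            counts.merge(doc[first] + " " + doc[second], 1, Integer::sum);
        }
        return counts;
    }

    // Made-up documents as {company, country, year}.
    static List<String[]> sampleDocs() {
        return Arrays.asList(
            new String[] {"ibm", "US", "2001"},
            new String[] {"ibm", "US", "2001"},
            new String[] {"ibm", "US", "2014"},
            new String[] {"sony", "JAPAN", "2001"});
    }

    public static void main(String[] args) {
        List<String[]> docs = sampleDocs();
        System.out.println("country: " + valueFrequencies(docs, 1));
        System.out.println("year: " + valueFrequencies(docs, 2));
        System.out.println("country+year: " + pairFrequencies(docs, 1, 2));
    }
}
```

The year 2001 has an individual frequency of 3 here, but that total is split across "US 2001" and "JAPAN 2001" in the pair counts, which is why the individual frequencies need their own requests.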
By the way, the query options builder has been deprecated for several releases
and could go away in any future release. You should instead use a DOM (such as
JDOM or XOM) or XMLStreamWriter to generate the options XML.
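As a rough sketch of the XMLStreamWriter route, the code below uses only the JDK's javax.xml.stream classes to build the "yearByCountry" tuples options that appear elsewhere in this thread. The class and method names are made up for illustration, and writing the resulting string to the server is left out.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical helper that generates query options XML with XMLStreamWriter.
final class OptionsXmlSketch {

    static final String SEARCH_NS = "http://marklogic.com/appservices/search";

    // Builds the options document with a country range and a year range.
    static String buildYearByCountryOptions() throws XMLStreamException {
        StringWriter out = new StringWriter();
        XMLStreamWriter xml = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
        xml.writeStartElement("options");
        xml.writeDefaultNamespace(SEARCH_NS);
        xml.writeStartElement("tuples");
        xml.writeAttribute("name", "yearByCountry");
        writeRange(xml, "xs:string", "http://marklogic.com/collation/", "country");
        writeRange(xml, "xs:gYear", null, "year");
        xml.writeEndElement(); // tuples
        xml.writeEndElement(); // options
        xml.flush();
        xml.close();
        return out.toString();
    }

    // Writes one <range> with its nested empty <element>; collation is optional.
    private static void writeRange(XMLStreamWriter xml, String type,
                                   String collation, String elementName)
            throws XMLStreamException {
        xml.writeStartElement("range");
        xml.writeAttribute("type", type);
        if (collation != null) {
            xml.writeAttribute("collation", collation);
        }
        xml.writeEmptyElement("element");
        xml.writeAttribute("ns", "");
        xml.writeAttribute("name", elementName);
        xml.writeEndElement(); // range
    }

    public static void main(String[] args) throws XMLStreamException {
        System.out.println(buildYearByCountryOptions());
    }
}
```

The writer emits the XML on a single line without indentation; that is fine for sending to the server, but harder to read when printed.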
Hoping that helps,
Erik Hennum
________________________________
From: [email protected]
[[email protected]] on behalf of Maisnam Ns
[[email protected]]
Sent: Friday, February 20, 2015 2:40 AM
To: MarkLogic Developer Discussion
Subject: [MarkLogic Dev General] JAVA API Query formation
Hi,
Can someone help me form a Java API query for the sample below?
Let's say this is file 1 and there are 1000 such different files:
<info>
<company>ibm</company>
<year>2001</year>
<country>US</country>
</info>
I just want to get the country, year and the count.
US 2001 70
US 2014 13
JAPAN 2000 10
Something like the above. I am able to get the count for only one element, not two:
QueryOptionsHandle options = new QueryOptionsHandle().withValues(
    qob.values("product",
        qob.range(
            qob.elementRangeIndex(new QName("country"),
                qob.stringRangeType(QueryOptions.DEFAULT_COLLATION))),
        "frequency-order"));
The above query gives me
US 190
CH 123
IND 70
Thanks
_______________________________________________
General mailing list
[email protected]
http://developer.marklogic.com/mailman/listinfo/general