Hi there,

Many thanks for your support.  Sorry it has taken me a few days to respond.
I had a quick run today with the latest head of the code and everything seems
to be fine, but I need to dig a little deeper into the functionality you
documented in your last email.

BTW, please feel free to use the example taxonomy.  Some bits were taken
directly from Wikipedia, and the top level of the hierarchy makes no business
sense whatsoever!!  However, it is a good test case for quickly checking the
consistency of the indexing by simply counting rows.  I am glad you find it
interesting enough, and I am thankful for your trouble in putting together a
C++ test for the case.

I should be able to give you more feedback in a few days, once my agenda
has cleared up a little bit.

Thanks again for your support!!  

Justo.
  

-----Original Message-----
From: Kesheng Wu [mailto:[email protected]] 
Sent: 08 December 2010 17:39
To: FastBit Users
Cc: Justo Ruiz Ferrer
Subject: Re: [FastBit-users] Keyword indexes

Hi, Justo,

A set of test functions has been added to FastBit's testing suite to
exercise the keyword indexes based on the output jrf.cpp attached to
the previous message.  I presume we have your blessing in using the
risk categories.  If that is a problem, please let us know soon so we
can replace the list with something else.

A small set of functions has been added to extract keywords from text
without the need for an externally provided term-document list.  This
allows one to specify an indexing option of "keywords" without an
explicit docidname.  In this case, the new parser is used to generate
the keyword index.  This additional feature should make the keyword
index more usable than before.  In the case of your risk categories,
because many keywords contain embedded spaces, one way to allow these
keywords to be recognized is to place commas between the keywords.
Such a comma-separated-values format is frequently used and may be a
reasonable option for your data.  Hope this is useful for you.
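
To illustrate the idea of comma-separated keywords with embedded spaces,
here is a minimal sketch in Python.  It is not FastBit's parser and does
not use any FastBit API; the function names and the sample rows are
hypothetical and chosen only for illustration.  It splits each text value
on commas, keeps the spaces inside each keyword, and records which rows
contain each keyword, so the row counts per keyword can be checked by hand.

```python
def split_keywords(text, delimiter=","):
    """Split a text value into keywords, keeping embedded spaces.

    Hypothetical helper: surrounding whitespace is trimmed and runs of
    inner whitespace are collapsed to a single space.
    """
    keywords = []
    for piece in text.split(delimiter):
        keyword = " ".join(piece.split())
        if keyword:  # skip empty pieces such as those from ",,"
            keywords.append(keyword)
    return keywords


def build_keyword_index(rows):
    """Map each keyword to the ascending list of row numbers containing it."""
    index = {}
    for row_number, text in enumerate(rows):
        # set() ensures a keyword repeated within one row is counted once
        for keyword in set(split_keywords(text)):
            index.setdefault(keyword, []).append(row_number)
    return index


if __name__ == "__main__":
    # Made-up sample values, not the actual risk categories.
    rows = [
        "credit risk, market risk",
        "market risk",
        "operational risk,  credit   risk",
    ]
    for keyword, row_numbers in sorted(build_keyword_index(rows).items()):
        print(keyword, len(row_numbers), row_numbers)
```

Note that this sketch treats keywords as case-sensitive; whether that
matches the behaviour you need is something to check against your data.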

When you get a chance to test the new code, please let us know how it
works for you.

Thanks.

John

PS: You can check out the latest source code from the SVN repository using
the following command:

svn checkout https://codeforge.lbl.gov/anonscm/fastbit

Endelec LLP is a company registered in England and Wales. 
Registered number: OC356543
Registered office: 30 City Road, London EC1Y 2AB

