Re: A question about FacetField constructor

2014-06-22 Thread Shai Erera
What do you mean by does not index anything? Do you get an exception when
you add a String[] with more than one element?

You should probably call conf.setHierarchical(dimension), but if you don't
do that you should receive an IllegalArgumentException telling you to do
that...

Shai


On Sun, Jun 22, 2014 at 6:34 AM, west suhanic 
wrote:

> Hello All:
>
> I am building sample code using lucene v4.8.1 to explore
> the new facet API. The problem I am having is that if I pass
> a populated string array nothing gets indexed while if
> I pass only the first element of the string array that value gets indexed.
> The code found below shows the case that works and the case that does not
> work. What am I doing wrong?
>
> ***Start of code sample***
>
> void showStuff( String... va )
> {
>     /** This code prints out the contents of va successfully. **/
>     for( int ii = 0 ; ii < va.length ; ii++ )
>         System.out.println( "value[" + ii + "] " + va[ii] );
> }
>
> for( final Map< String, String[] > fd : allFacetData )
> {
>     final Document doc = new Document();
>     for( final Map.Entry< String, String[] > entry : fd.entrySet() )
>     {
>         final String key = entry.getKey();
>         String[] value = entry.getValue();
>         showStuff( value );
>
>         /** This call indexes successfully **/
>         final FacetField newFF = new FacetField( key, value[0] );
>
>         /**
>          * This call will not index anything if the value String array
>          * has more than one element.
>          * final FacetField newFF = new FacetField( key, value );
>          */
>         doc.add( newFF );
>     }
>
>     try
>     {
>         final Document theBuildDoc = configFacetsHandle.build( taxoWriter, doc );
>         indexWriter.addDocument( theBuildDoc );
>         indexWriter.addDocument( configFacetsHandle.build( taxoWriter, doc ) );
>     }
>     catch( IOException ioe )
>     {
>         eMsg.append( method );
>         eMsg.append( " failed with the exception " );
>         eMsg.append( ioe.toString() );
>         return constantValuesInterface.FAILURE;
>     }
> }
>
> ***End of code sample***
>
> regards,
>
> West Suhanic
>


Re: fuzzy/case insensitive AnalyzingSuggester )

2014-06-22 Thread Clemens Wyss DEV
Oli, 
thanks for your valuable inputs!

> Generally, we found it beneficial to not combine all functionality in a 
> single suggester
Makes absolute sense, but it doesn't help keep the RAM load low ;) unless you go 
with WFSTs. 

What we have done so far is build a term index based on the terms of the 
corresponding (data) index. I.e. an index always comes paired with its 
corresponding term index.

-Original Message-
From: Oliver Christ [mailto:ochr...@ebsco.com] 
Sent: Friday, June 20, 2014 3:52 PM
To: java-user@lucene.apache.org
Subject: RE: fuzzy/case insensitive AnalyzingSuggester )

Hi Clemens,

I haven't yet built a suggester which combines all three, and am not aware of 
one. I'd love to have one though ;-)

Case- and diacritics insensitivity is supported out-of-the-box by the analyzing 
suggesters, including the FuzzySuggester. The logic is in the Analyzer.

I haven't yet tried out AnalyzingInfixSuggester, and haven't investigated 
whether it's possible to combine that with FuzzySuggester (which also is an 
analyzing suggester).

Due to memory constraints, we build infix suggesters by adding each relevant 
substring, but use WFST suggesters with payloads as the base, to reduce RAM 
load at runtime. We call the analyzer in the dictionary iterator. At search 
time, we look up the surface form (completion) in a secondary index using the 
payload as a key (and for deduping).
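A minimal sketch of the substring approach described above, under loud assumptions: a TreeMap stands in for the WFST automaton, a HashMap stands in for the secondary index, the "analyzer" only lowercases, and "relevant substrings" are taken to be the suffixes that start at a word boundary. None of these names come from Lucene or from Oli's code; they are hypothetical and only illustrate the payload-as-key lookup and deduping.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical model of an infix suggester built from word-start suffixes.
final class InfixBySuffixes {
    // Stand-in for the automaton: analyzed suffix -> payload ids.
    private final TreeMap<String, Set<Integer>> suffixToIds = new TreeMap<>();
    // Stand-in for the secondary index: payload id -> surface form.
    private final Map<Integer, String> surfaceById = new HashMap<>();

    // Assumed "analyzer": lowercasing only.
    static String analyze(String text) {
        return text.toLowerCase(Locale.ROOT);
    }

    // Index one entry under every suffix that starts at a word boundary.
    void add(int id, String surface) {
        surfaceById.put(id, surface);
        String analyzed = analyze(surface);
        int start = 0;
        while (start >= 0 && start < analyzed.length()) {
            suffixToIds.computeIfAbsent(analyzed.substring(start), k -> new TreeSet<>()).add(id);
            int space = analyzed.indexOf(' ', start);
            start = (space < 0) ? -1 : space + 1;
        }
    }

    // Prefix-scan the suffixes, dedupe by payload id, then resolve surface forms.
    List<String> lookup(String typed, int max) {
        String prefix = analyze(typed);
        Set<Integer> ids = new LinkedHashSet<>();
        for (Map.Entry<String, Set<Integer>> e : suffixToIds.tailMap(prefix, true).entrySet()) {
            if (!e.getKey().startsWith(prefix)) {
                break; // keys are sorted, so no later key can match
            }
            ids.addAll(e.getValue());
        }
        List<String> completions = new ArrayList<>();
        for (int id : ids) {
            if (completions.size() == max) {
                break;
            }
            completions.add(surfaceById.get(id));
        }
        return completions;
    }

    public static void main(String[] args) {
        InfixBySuffixes suggester = new InfixBySuffixes();
        suggester.add(1, "Apache Lucene");
        suggester.add(2, "Lucene in Action");
        System.out.println(suggester.lookup("luc", 10));
    }
}
```

Because every suffix carries only the id, the same entry reached through two different suffixes collapses to a single completion, which is the deduping role of the payload mentioned above.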

If FuzzySuggester supports payloads (haven't checked), you could get an infix 
suggester using the same approach. That will lead to large automata, and as 
you'd have to look up the completion in a secondary index, you'd never use the 
surface form returned by the automaton itself, so it's a waste of space. WFSTs 
are more space-efficient but don't support payloads (if I remember correctly) 
and there's no fuzzy WFST suggester either :(

Generally, we found it beneficial to not combine all functionality in a single 
suggester, but use separate automata in a cascaded model. We first look up 
completions in the prefix non-fuzzy suggester. Based on several criteria, we 
may then consult the infix suggester, and if needed, the fuzzy suggester. The 
rationale is that we don't want high-ranking fuzzy or infix hits to fill up the 
completion list while there are good (but less popular) prefix hits. Having 
control over which suggester is used when, and how its specific suggestions are 
merged into the final result list, helps improve the user experience, at 
least with our use cases.
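The cascaded model can be sketched roughly as follows. This is an illustration only, not the implementation described above: each stage is a plain function from the typed text to a list of completions, the only criterion for consulting the next stage is "not enough hits yet" (the real criteria are not given in this thread), and the class and method names are hypothetical. A fuzzy stage would simply be a third function in the list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

// Hypothetical cascade: consult stages in order until enough completions exist.
final class CascadedSuggester {
    private final int wanted;
    private final List<Function<String, List<String>>> stages;

    CascadedSuggester(int wanted, List<Function<String, List<String>>> stages) {
        this.wanted = wanted;
        this.stages = stages;
    }

    // Earlier stages keep their rank; later stages only fill the remaining slots.
    List<String> suggest(String typed) {
        Set<String> merged = new LinkedHashSet<>();
        for (Function<String, List<String>> stage : stages) {
            if (merged.size() >= wanted) {
                break; // assumed criterion: stop once the list is full
            }
            for (String completion : stage.apply(typed)) {
                if (merged.size() >= wanted) {
                    break;
                }
                merged.add(completion); // the set drops duplicates across stages
            }
        }
        return new ArrayList<>(merged);
    }

    // Toy prefix stage over an in-memory list.
    static Function<String, List<String>> prefixStage(List<String> corpus) {
        return typed -> {
            List<String> hits = new ArrayList<>();
            for (String s : corpus) {
                if (s.startsWith(typed)) {
                    hits.add(s);
                }
            }
            return hits;
        };
    }

    // Toy infix stage over an in-memory list.
    static Function<String, List<String>> infixStage(List<String> corpus) {
        return typed -> {
            List<String> hits = new ArrayList<>();
            for (String s : corpus) {
                if (s.contains(typed)) {
                    hits.add(s);
                }
            }
            return hits;
        };
    }

    public static void main(String[] args) {
        List<String> corpus = Arrays.asList("lucene", "lucid", "solr lucene");
        CascadedSuggester cascade = new CascadedSuggester(
                5, Arrays.asList(prefixStage(corpus), infixStage(corpus)));
        System.out.println(cascade.suggest("luc"));
    }
}
```

With the list limited to two entries, the infix stage is never consulted for "luc", so infix hits cannot push prefix hits out of the list.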

Cheers, Oli

-Original Message-
From: Clemens Wyss DEV [mailto:clemens...@mysign.ch] 
Sent: Friday, June 20, 2014 6:47 AM
To: java-user@lucene.apache.org
Subject: Re: fuzzy/case insensitive AnalyzingSuggester )

Sorry for re-asking. 
Has anyone implemented an AnalyzingSuggester which 
- is fuzzy
- is case insensitive (or must/should this be implemented by the analyzer?)
- does infix search
[- has a small memory footprint]

-Original Message-
From: Clemens Wyss DEV [mailto:clemens...@mysign.ch] 
Sent: Friday, June 13, 2014 2:53 PM
To: java-user@lucene.apache.org
Subject: fuzzy/case insensitive AnalyzingSuggester )

Looking for an AnalyzingSuggester which supports
- fuzziness
- case insensitivity
- small (in-memory) footprint (*)

(*) I just tried to "hand" my big IndexReader (see other post "[lucene 4.6] NPE 
when calling IndexReader#openIfChanged") into JaspellLookup and got an OOM.
Is there any (Jaspell)Lookup implementation that can handle really big indexes 
(by swapping out part of the "lookup-table")?


-
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org




Re: Lucene Facets Module 4.8.1

2014-06-22 Thread Jigar Shah
I will dig further into your suggestions, and also assert that the FacetsConfig
object is the same.

While debugging I found that the buildFacetsResult(...) method in
DrillSideways.java internally invokes the following constructor from
FastTaxonomyFacetCounts.java:

FastTaxonomyFacetCounts() {
// FacetsConfig.DEFAULT_INDEX_FIELD_NAME is '$facets'
this(FacetsConfig.DEFAULT_INDEX_FIELD_NAME, taxoReader, config, fc);
}

Shouldn't it invoke the following constructor with the correct indexFieldName? In
my case the indexFieldName is 'city', for the dimension 'CITY'.

FastTaxonomyFacetCounts(String indexFieldName, TaxonomyReader taxoReader,
FacetsConfig config, FacetsCollector fc) throws IOException {
super(indexFieldName, taxoReader, config);
...
}

Thanks
Jigar Shah.



On Sat, Jun 21, 2014 at 11:01 PM, Shai Erera  wrote:

> If you can, while in debug mode try to note the instance ID of the
> FacetsConfig, and assert it is indeed the same (i.e. indexConfig ==
> searchConfig).
>
> Shai
>
>
> On Sat, Jun 21, 2014 at 8:26 PM, Michael McCandless <
> luc...@mikemccandless.com> wrote:
>
> > Are you sure it's the same FacetsConfig at search time?  Because the
> > exception implies your CITY field didn't have
> > config.setIndexFieldName("CITY", "city") called.
> >
> > Or, can you try commenting out 'config.setIndexFieldName("CITY",
> > "city")' at index time and see if the exception still happens?
> >
> > Mike McCandless
> >
> > http://blog.mikemccandless.com
> >
> >
> > On Sat, Jun 21, 2014 at 1:08 AM, Jigar Shah wrote:
> > > Thanks for helping me.
> > >
> > > Yes, I did a couple of things.
> > >
> > > Below is the simple code I use for indexing:
> > >
> > > TrackingIndexWriter nrtWriter = ...
> > > DirectoryTaxonomyWriter taxoWriter = ...
> > >
> > > FacetsConfig config = new FacetsConfig();
> > > config.setHierarchical("CITY", true);
> > > config.setMultiValued("CITY", true);
> > > // I kept dimName different from indexFieldName
> > > config.setIndexFieldName("CITY", "city");
> > >
> > > Added indexing searchable fields...
> > >
> > > doc.add(new FacetField("CITY", "India", "Gujarat", "Vadodara"));
> > > doc.add(new FacetField("CITY", "India", "Gujarat", "Ahmedabad"));
> > >
> > > nrtWriter.addDocument(config.build(taxoWriter, doc));
> > >
> > > Below is the code I use for searching:
> > >
> > > TaxonomyReader taxoReader = new DirectoryTaxonomyReader(taxoWriter);
> > >
> > > Query query = ...
> > > IndexSearcher searcher = ...
> > > DrillDownQuery ddq = new DrillDownQuery(config, query);
> > > // The config object is the same one I created before.
> > > DrillSideways ds = new DrillSideways(searcher, config, taxoReader);
> > > DrillSidewaysResult result = ds.search(query, null, null, start + limit,
> > >     null, true, true);
> > > ...
> > > Facets f = result.facets;
> > > // The exception is generated here. I didn't really perform any drill-down;
> > > // it's just the original query for the first time, wrapped in a DrillDownQuery.
> > > FacetResult fr = f.getTopChildren(5, "CITY");
> > >
> > > ... and the call below gives me an empty collection:
> > >
> > > List<FacetResult> frs = f.getAllDims(5);
> > >
> > > I debugged the source code and found that it internally calls
> > >
> > > FastTaxonomyFacetCounts(indexFieldName, taxoReader, config) // same config object I created before
> > >
> > > which then calls
> > >
> > > IntTaxonomyFacets(indexFieldName, taxoReader, config) // same config object I created before
> > >
> > > During these calls the value of indexFieldName is "$facets", defined by the
> > > constant 'public static final String DEFAULT_INDEX_FIELD_NAME = "$facets";'
> > > in FacetsConfig.
> > >
> > > My question is: if I am using the same FacetsConfig for indexing and
> > > searching, why does it not identify the correct field name, and instead go
> > > for "$facets"?
> > >
> > > Please correct me if I have misunderstood, or point me to the correct way to
> > > solve the above problem.
> > >
> > > Many thanks.
> > > Jigar Shah.
> >
> >
> >
>


Re: Lucene Facets Module 4.8.1

2014-06-22 Thread Shai Erera
OK I see. I think the code works OK though. It documents that you should
call the other constructor if you specify a custom indexFieldName for some
of the dimensions.

Currently if you index dimensions under different indexFieldNames, you
should initialize a FacetsCounts per indexFieldName. There's no way for the
default ctor of FastTaxonomyFacetCounts to determine which indexFieldName
to use as it doesn't know which dimensions you're going to ask to count.
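To make the point concrete, here is a toy model in plain Java. It is NOT the Lucene API: the class, its maps and its methods are hypothetical stand-ins that only mimic how facet values end up under an index field name and why a counter has to be bound to that field. In real code the equivalent step would presumably be to call the four-argument constructor quoted in this thread, FastTaxonomyFacetCounts(indexFieldName, taxoReader, config, fc), once per index field name.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model only: mimics "one counter per index field name".
final class PerFieldFacetCounts {
    // Same value as FacetsConfig.DEFAULT_INDEX_FIELD_NAME quoted in this thread.
    static final String DEFAULT_INDEX_FIELD_NAME = "$facets";

    // dimension -> index field name, like config.setIndexFieldName(dim, field)
    private final Map<String, String> dimToField = new HashMap<>();
    // index field name -> facet paths stored under that field
    private final Map<String, List<String>> index = new HashMap<>();

    void setIndexFieldName(String dim, String indexFieldName) {
        dimToField.put(dim, indexFieldName);
    }

    // Dimensions without a custom field fall back to the default field.
    String fieldFor(String dim) {
        return dimToField.getOrDefault(dim, DEFAULT_INDEX_FIELD_NAME);
    }

    // "Index" one facet value under the field configured for its dimension.
    void addFacet(String dim, String... path) {
        String full = dim + "/" + String.join("/", path);
        index.computeIfAbsent(fieldFor(dim), k -> new ArrayList<>()).add(full);
    }

    // A counter bound to exactly one index field; it sees nothing else.
    Map<String, Integer> countsFor(String indexFieldName) {
        Map<String, Integer> counts = new TreeMap<>();
        List<String> stored = index.getOrDefault(indexFieldName, Collections.<String>emptyList());
        for (String path : stored) {
            counts.merge(path, 1, Integer::sum);
        }
        return counts;
    }

    // Route a dimension to the counter of the field it was indexed under.
    Map<String, Integer> countsForDim(String dim) {
        return countsFor(fieldFor(dim));
    }

    public static void main(String[] args) {
        PerFieldFacetCounts model = new PerFieldFacetCounts();
        model.setIndexFieldName("CITY", "city");
        model.addFacet("CITY", "India", "Gujarat", "Vadodara");
        model.addFacet("CITY", "India", "Gujarat", "Ahmedabad");
        System.out.println(model.countsFor(DEFAULT_INDEX_FIELD_NAME));
        System.out.println(model.countsForDim("CITY"));
    }
}
```

In this model the counter bound to "$facets" comes back empty while the one bound to "city" sees both values, which mirrors the empty getAllDims result reported earlier. Lucene's facet module also appears to ship a MultiFacets class for routing dimensions to different Facets instances; that is worth checking against the 4.8.1 Javadoc rather than taking from this sketch.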

Hope that helps.

Shai




Re: EarlyTerminatingSortingCollector help needed..

2014-06-22 Thread Ravikumar Govindarajan
Thanks for your reply & clarifications

> What do you mean by "When I use a SortField instead"? Unless you are
> using early termination, Collector.collect is supposed to be called
> for every matching document



For a normal sorted query on a top-level searcher, I execute

TopDocs docs = searcher.search(query, 50, sortField)

Then I can issue reader.document() for the final list of exactly 50 docs, which
gives me a global order across segments, but at the obvious cost of memory...

SortingMergePolicy + ETSC will make me do 50*N [N = number of segments] collects,
which could increase the cost of seeks when each segment collects a considerable
number of hits...
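The arithmetic and the global-ordering concern can be sketched as below. This is a simplified model rather than Lucene code: segments are plain arrays of sort values that are already sorted (as they would be after a sorting merge), and the class and method names are hypothetical. It shows that 50*N is an upper bound on collect calls, and that merging the first n hits of each segment is enough to obtain a globally ordered top n.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical model of per-segment early termination plus a global merge.
final class EarlyTerminationSketch {
    // Each segment contributes at most numDocsToCollect collect() calls.
    static int collectCalls(int[] matchesPerSegment, int numDocsToCollect) {
        int total = 0;
        for (int matches : matchesPerSegment) {
            total += Math.min(matches, numDocsToCollect);
        }
        return total;
    }

    // Segments are sorted ascending; only the first n values of each are read.
    static List<Long> globalTopN(List<long[]> sortedSegments, int n) {
        PriorityQueue<Long> heap = new PriorityQueue<>(); // min-heap on the sort value
        for (long[] segment : sortedSegments) {
            for (int i = 0; i < segment.length && i < n; i++) {
                heap.add(segment[i]); // stop after n hits: early termination
            }
        }
        List<Long> top = new ArrayList<>();
        while (top.size() < n && !heap.isEmpty()) {
            top.add(heap.poll());
        }
        return top;
    }

    public static void main(String[] args) {
        int[] fifteenBigSegments = new int[15];
        Arrays.fill(fifteenBigSegments, 1000);
        System.out.println(collectCalls(fifteenBigSegments, 50));
        System.out.println(collectCalls(new int[] {10, 1000, 30}, 50));
        List<long[]> segments = Arrays.asList(
                new long[] {1, 4, 9}, new long[] {2, 3, 10}, new long[] {5});
        System.out.println(globalTopN(segments, 3));
    }
}
```

The merge is correct because any value in the global top n must also be among the first n values of its own segment, so nothing beyond that cutoff can be missed.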

>  - you can afford the merging overhead (ie. for heavy indexing
> workloads, this might not be the best solution)
>  - there is a single sort order that is used for most queries
>  - you don't need any feature that requires to collect all documents
> (like computing the total hit count or facets).


Our use case fits all three of these points perfectly, and that's why we wanted
to explore this. But our final set of results must also be globally
ordered. Maybe it's a mistake to assume that sorting can be entirely
replaced with SMP + ETSC...

> I would not advise to use the stored fields API, even in the context
> of early termination. Doc values should be more efficient here?


I read your excellent blog on stored-fields compression, where you've
mentioned that stored-fields now take only one random seek. [
http://blog.jpountz.net/post/35667727458/stored-fields-compression-in-lucene-4-1
]

If so, then what could make DocValues still a winner?

--
Ravi


On Sat, Jun 21, 2014 at 6:41 PM, Adrien Grand  wrote:

> Hi Ravikumar,
>
> On Fri, Jun 20, 2014 at 12:14 PM, Ravikumar Govindarajan
>  wrote:
> > If my "numDocsToCollect" = 50 and no.of. segments = 15, then
> > collector.collect() will be called 750 times.
>
> That is the worst-case indeed. However if some of your segments have
> less than 50 matches, `collect` will only be called on those matches.
>
> > When I use a SortField instead, then TopFieldDocs does the sorting for
> all
> > segments and collector.collect() will be called only 50 times...
>
> What do you mean by "When I use a SortField instead"? Unless you are
> using early termination, Collector.collect is supposed to be called
> for every matching document.
>
> > Assuming a stored-field seek for every collector.collect(), will it be
> > advisable to still persist with ETSC? Was it introduced as a trade-off
> b/n
> > memory & disk?
>
> I would not advise to use the stored fields API, even in the context
> of early termination. Doc values should be more efficient here?
>
> The trade-off is not really about memory and disk. What it tries to
> achieve is to make queries much faster provided that:
>  - you can afford the merging overhead (ie. for heavy indexing
> workloads, this might not be the best solution)
>  - there is a single sort order that is used for most queries
>  - you don't need any feature that requires to collect all documents
> (like computing the total hit count or facets).
>
> --
> Adrien
>
>
>


Re: A question about FacetField constructor

2014-06-22 Thread west suhanic
Hello:

>What do you mean by does not index anything?

When I do a search the value returned for the "dim" set to "Publish Date"
is null. If I pass through value[0] the publish date year is returned by
the search.

setHierarchical was called.

When a String[] with more than one element is passed an exception is not
thrown.

I am open to all suggestions as to what I am missing.

regards,

west suhanic




Re: A question about FacetField constructor

2014-06-22 Thread Shai Erera
Reply wasn't sent to the list.
On Jun 22, 2014 8:15 PM, "Shai Erera"  wrote:

> Can you post an example which demonstrates the problem? It's also
> interesting how you count the facets, eg do you use a TaxonomyFacets object
> or something else?
>
> Have you looked at the facet demo code? It contains examples for using
> hierarchical facets.
>
> Shai


Re: Lucene Facets Module 4.8.1

2014-06-22 Thread Jigar Shah
After commenting out

//config.setIndexFieldName("CITY", "city");

at search time, before I call getTopChildren(...), I get the following
exception:

Caused by: java.lang.ArrayIndexOutOfBoundsException: 2
at org.apache.lucene.facet.taxonomy.FastTaxonomyFacetCounts.count(FastTaxonomyFacetCounts.java:74)
[lucene-facet-4.8.1.jar:4.8.1 1594670 - rmuir - 2014-05-14 19:23:23]
at org.apache.lucene.facet.taxonomy.FastTaxonomyFacetCounts.<init>(FastTaxonomyFacetCounts.java:49)
[lucene-facet-4.8.1.jar:4.8.1 1594670 - rmuir - 2014-05-14 19:23:23]
at org.apache.lucene.facet.taxonomy.FastTaxonomyFacetCounts.<init>(FastTaxonomyFacetCounts.java:39)
[lucene-facet-4.8.1.jar:4.8.1 1594670 - rmuir - 2014-05-14 19:23:23]
at org.apache.lucene.facet.DrillSideways.buildFacetsResult(DrillSideways.java:110)
[lucene-facet-4.8.1.jar:4.8.1 1594670 - rmuir - 2014-05-14 19:23:23]
at org.apache.lucene.facet.DrillSideways.search(DrillSideways.java:177)
[lucene-facet-4.8.1.jar:4.8.1 1594670 - rmuir - 2014-05-14 19:23:23]
at org.apache.lucene.facet.DrillSideways.search(DrillSideways.java:203)
[lucene-facet-4.8.1.jar:4.8.1 1594670 - rmuir - 2014-05-14 19:23:23]

Application-level exceptions.
...
...





Re: Lucene Facets Module 4.8.1

2014-06-22 Thread Jigar Shah
The FacetsConfig object is the same.

(indexConfig == searchConfig) returns true.




Re: Lucene Facets Module 4.8.1

2014-06-22 Thread Jigar Shah
Thanks very much for the help.

OK, so I believe the rest of my code is correct; I just need to

"Initialize a FacetsCounts per indexFieldName"

How do I do this? Is there a test case or some sample code available?

Thanks,






On Sun, Jun 22, 2014 at 10:05 PM, Shai Erera  wrote:

> OK I see. I think the code works OK though. It documents that you should
> call the other constructor if you specify a custom indexFieldName for some
> of the dimensions.
>
> Currently if you index dimensions under different indexFieldNames, you
> should initialize a FacetsCounts per indexFieldName. There's no way for the
> default ctor of FastTaxonomyFacetCounts to determine which indexFieldName
> to use as it doesn't know which dimensions you're going to ask to count.
>
> Hope that helps.
>
> Shai
>
>
> On Sun, Jun 22, 2014 at 4:05 PM, Jigar Shah  wrote:
>
> > I will try to dig more into your suggestions, and also assert the
> > FacetsConfig object.
> >
> > While debugging I found the buildFacetsResult(...) method in
> > DrillSideways.java.
> >
> > It internally invokes the following constructor from
> > FastTaxonomyFacetCounts.java:
> >
> > FastTaxonomyFacetCounts() {
> > this(FacetsConfig.DEFAULT_INDEX_FIELD_NAME, taxoReader, config, fc);
> > // FacetsConfig.DEFAULT_INDEX_FIELD_NAME is '$facets'
> > }
> >
> > Shouldn't it invoke the following constructor with the correct
> > indexFieldName? In my case the indexFieldName is 'city', for the dimension
> > 'CITY'.
> >
> >  FastTaxonomyFacetCounts(String indexFieldName, TaxonomyReader taxoReader,
> > FacetsConfig config, FacetsCollector fc) throws IOException {
> > super(indexFieldName, taxoReader, config);
> > ...
> > }
> >
> > Thanks
> > Jigar Shah.
> >
> >
> >
> > On Sat, Jun 21, 2014 at 11:01 PM, Shai Erera  wrote:
> >
> > > If you can, while in debug mode try to note the instance ID of the
> > > FacetsConfig, and assert it is indeed the same (i.e. indexConfig ==
> > > searchConfig).
> > >
> > > Shai
> > >
> > >
> > > On Sat, Jun 21, 2014 at 8:26 PM, Michael McCandless <
> > > luc...@mikemccandless.com> wrote:
> > >
> > > > Are you sure it's the same FacetsConfig at search time?  Because the
> > > > exception implies your CITY field didn't have
> > > > config.setIndexFieldName("CITY", "city") called.
> > > >
> > > > Or, can you try commenting out 'config.setIndexFieldName("CITY",
> > > > "city")' at index time and see if the exception still happens?
> > > >
> > > > Mike McCandless
> > > >
> > > > http://blog.mikemccandless.com
> > > >
> > > >
> > > > On Sat, Jun 21, 2014 at 1:08 AM, Jigar Shah 
> > > wrote:
> > > > > Thanks for helping me.
> > > > >
> > > > > Yes, I did a couple of things:
> > > > >
> > > > > Below is the simple code I use for indexing.
> > > > >
> > > > > TrackingIndexWriter nrtWriter
> > > > > DirectoryTaxonomyWriter taxoWriter = ...
> > > > > 
> > > > > FacetsConfig config = new FacetsConfig();
> > > > > config.setHierarchical("CITY", true)
> > > > > config.setMultiValued("CITY", true);
> > > > > config.setIndexFieldName("CITY","city") // I kept dimName different
> > > > > from indexFieldName
> > > > > 
> > > > > Added indexing searchable fields...
> > > > > 
> > > > >
> > > > > doc.add( new FacetField("CITY", "India", "Gujarat", "Vadodara" ))
> > > > > doc.add( new FacetField("CITY", "India", "Gujarat", "Ahmedabad" ))
> > > > >
> > > > >  nrtWriter.addDocument(config.build(taxoWriter, doc));
> > > > >
> > > > > Below is the code I use for searching
> > > > >
> > > > > TaxonomyReader taxoReader = new DirectoryTaxonomyReader(taxoWriter);
> > > > >
> > > > > Query query = ...
> > > > > IndexSearcher searcher = ...
> > > > > DrillDownQuery ddq = new DrillDownQuery(config, query);
> > > > > DrillSideways ds = new DrillSideways(searcher, config, taxoReader);
> > > > > // Config object is the same one I created before
> > > > > DrillSidewaysResult result = ds.search(query, null, null, start + limit,
> > > > > null, true, true)
> > > > > ...
> > > > > Facets f = result.facets
> > > > > FacetResult fr = f.getTopChildren(5, "CITY") // [Exception is generated]
> > > > > // I didn't really perform any drill-down; it's just the original query
> > > > > // for the first time, but wrapped in a DrillDownQuery.
> > > > >
> > > > > ... and the call below gives me an empty collection.
> > > > >
> > > > > List<FacetResult> frs = f.getAllDims(5)
> > > > >
> > > > > I debugged the source code and found that it internally calls
> > > > >
> > > > > FastTaxonomyFacetCounts(indexFieldName, taxoReader, config) // Config
> > > > > object is the same one I created before
> > > > >
> > > > > which then calls
> > > > >
> > > > > IntTaxonomyFacets(indexFieldName, taxoReader, config) // Config object is
> > > > > the same one I created before
> > > > >
> > > > > During these calls the value of indexFieldName is "$facets", defined by
> > > > > the constant 'public static final String DEFAULT_INDEX_FIELD_NAME =
> > > > > "$facets";' in FacetsConfig.
> > > > >
> > > > > My question is if i am