Re: Shingles behavior

2020-05-20 Thread Radu Gheorghe
Hi Alex, long time no see :)

I tried with sow, and that basically invalidates query-time shingles (it
only matches mona OR lisa OR smile).

I'm using shingles at both index and query time as a substitute for pf2 and
pf3: the more shingles I match, the more relevant the document. Also,
higher order shingles naturally get lower frequencies, meaning they get a
"natural" boost.

Best regards,
Radu

On Thu, 21 May 2020, 00:28 Alexandre Rafalovitch wrote:

> Did you try it with 'sow' parameter both ways? I am not sure I fully
> understand the question, especially with shingling on both passes
> rather than just indexing one. But at least it is something to try and
> is one of the difference areas between Solr and ES.
>
> Regards,
>Alex.
>
> On Tue, 19 May 2020 at 05:59, Radu Gheorghe 
> wrote:
> >
> > Hello Solr users,
> >
> > I’m quite puzzled about how shingles work. The way tokens are analysed
> looks fine to me, but the query seems too restrictive.
> >
> > Here’s the sample use-case. I have three documents:
> >
> > mona lisa smile
> > mona lisa
> > mona
> >
> > I have a shingle filter set up like this (both index- and query-time):
> >
> > > <filter class="solr.ShingleFilterFactory" maxShingleSize="4"/>
> >
> > When I query for “Mona Lisa smile” (no quotes), I expect to get all
> three documents back, in that order. Because the first document matches all
> the terms:
> >
> > mona
> > mona lisa
> > mona lisa smile
> > lisa
> > lisa smile
> > smile
> >
> > And the second one matches only some, and the third document only
> matches one.
> >
> > Instead, I only get the first document back. That’s because the query
> expects all the “words” to match:
> >
> > > "parsedquery":"+DisjunctionMaxQuery((((+shingle_field:mona
> +usage_query_view_tags:lisa +shingle_field:smile) (+shingle_field:mona
> +shingle_field:lisa smile) (+shingle_field:mona lisa +shingle_field:smile)
> (shingle_field:mona lisa smile))))",
> >
> > The query above is generated by the Edismax query parser, when I’m using
> “shingle_field” as “df”.
> >
> > Is there a way to get “any of the words” to match? I’ve tried all the
> options I can think of:
> > - different query parsers
> > - q.op=OR
> > - mm=0 (or 1 or 0% or 10% or…)
> >
> > Nothing seems to change the parsed query from the above.
> >
> > I’ve compared this to the behaviour of Elasticsearch. There, I get “OR”
> by default, and minimum_should_match works as expected. The only difference
> I see between the two, on the analysis side, is that tokens start at 0 in
> Elasticsearch and at 1 in Solr. I doubt that’s the problem, because I see
> that the default “text_en”, for example, also starts at position 1.
> >
> > Is it just a bug that mm doesn’t work in the context of shingles? Or is
> there a workaround?
> >
> > Thanks and best regards,
> > Radu
>


Re: Need help on handling large size of index.

2020-05-20 Thread Shawn Heisey

On 5/20/2020 11:43 AM, Modassar Ather wrote:

Can you please help me with following few questions?

- What is the ideal index size per shard?


We have no way of knowing that.  A size that works well for one index 
use case may not work well for another, even if the index size in both 
cases is identical.  Determining the ideal shard size requires 
experimentation.


https://lucidworks.com/post/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/


- The optimisation takes a lot of time and IOPs to complete. Will
increasing the number of shards help in reducing the optimisation time and
IOPs?


No, changing the number of shards will not help with the time required 
to optimize, and might make it slower.  Increasing the speed of the 
disks won't help either.  Optimizing involves a lot more than just 
copying data -- it will never use all the available disk bandwidth of 
modern disks.  SolrCloud optimizes the shard replicas making up 
a full collection sequentially, not simultaneously.



- We are planning to reduce each shard index size to 30GB and the entire
3.5 TB index will be distributed across more shards. In this case to almost
70+ shards. Will this help?


Maybe.  Maybe not.  You'll have to try it.  If you increase the number 
of shards without adding additional servers, I would expect things to 
get worse, not better.



Kindly share your thoughts on how best we can use Solr with such a large
index size.


Something to keep in mind -- memory is the resource that makes the most 
difference in performance.  Buying enough memory to get decent 
performance out of an index that big would probably be very expensive. 
You should probably explore ways to make your index smaller.  Another 
idea is to split things up so the most frequently accessed search data 
is in a relatively small index and lives on beefy servers, and data used 
for less frequent or data-mining queries (where performance doesn't 
matter as much) can live on less expensive servers.


Thanks,
Shawn


MIGRATE without split.key?

2020-05-20 Thread slly
Hello everyone,


I want to migrate data from one collection to another with the MIGRATE API, but if 
the split.key parameter is not specified, it cannot be executed. 


Why can't we remove this limitation? Is there a better way to migrate data?


Thanks.

Solr Atomic update change value and field name

2020-05-20 Thread Hup Chen
I am new to Solr. I tried to do an atomic update using a .json file. 
$SOLR/bin/post not only changed the field values, the field name also became 
"fieldname.set"; for instance, "price" became "price.set".  Updating via curl 
against the /update handler was working well, but since I have several million 
records, I can't update by calling curl several million times; that would be 
extremely slow.

Any help will be appreciated.


# /usr/local/solr/bin/solr version
8.5.1

# curl http://localhost:8983/solr/books/select?q=id%3A0371558727
"response":{"numFound":1,"start":0,"docs":[
  {
"id":"0371558727",
"price":19.0,
"_version_":1667214802265571328}]
}

# cat test.json
[
{"id":"0371558727",
 "price":{"set":19.95}
}
]

# /usr/local/solr/bin/post -p 8983 -c books test.json

# curl http://localhost:8983/solr/books/select?q=id%3A0371558727
"response":{"numFound":1,"start":0,"docs":[
  {
"id":"0371558727",
"price.set":[19.95],
"_version_":1667214933776924672}]
}
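One workaround sketch: build the atomic-update documents into large batches and send each batch to the plain /update handler in a single request, instead of one curl per record. Everything below (collection name, batch size, the helper itself) is hypothetical, not something bin/post provides:

```python
import json

def atomic_update_batches(rows, batch_size=1000):
    """Group (id, price) rows into JSON payloads for Solr's /update handler.

    Each payload is a JSON array of atomic-update documents, so one HTTP
    request carries thousands of "set" operations instead of one per record.
    """
    batch = []
    for doc_id, price in rows:
        batch.append({"id": doc_id, "price": {"set": price}})
        if len(batch) == batch_size:
            yield json.dumps(batch)
            batch = []
    if batch:  # flush the final partial batch
        yield json.dumps(batch)

# Two records fit into a single payload at the default batch size.
payloads = list(atomic_update_batches([("0371558727", 19.95), ("0371558728", 9.99)]))
```

Each payload could then be posted once, e.g. `curl -H 'Content-Type: application/json' --data-binary @batch.json 'http://localhost:8983/solr/books/update?commit=true'`, so a few thousand requests cover millions of records. If I understand bin/post's behavior, it routes .json files through /update/json/docs, which flattens the nested "set" object into a literal "price.set" field; the plain /update handler interprets it as an atomic update.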




Re: Need help on handling large size of index.

2020-05-20 Thread Phill Campbell
In my world your index size is common.

Optimal Index size: Depends on what you are optimizing for. Query Speed? 
Hardware utilization? 
Optimizing the index is something I never do. We live with about 28% deletes. 
You should check your configuration for your merge policy.
I run 120 shards, and I am currently redesigning for 256 shards.
Increased sharding has helped reduce query response time, but surely there is a 
point where the collation of results starts to be the bottleneck.
I run the 120 shards on 90 r4.4xlarge instances with a replication factor of 3.

The things missing are:
What does your schema look like? I index around 120 fields per document.
What do your queries look like? Mine are so varied that caching never helps; 
the same query rarely comes through.
My system takes continuous updates, yours does not.

It is really up to you to experiment.

If you follow the development pattern of Design By Use (DBU), the first thing 
you do for Solr, and even for SQL, is to come up with your queries first. Then 
design the schema. Then figure out how to distribute it for performance.

Oh, another thing: are you concerned about availability? Do you have a 
replication factor > 1? Do you run those replicas in a different region for 
safety?
How many zookeepers are you running and where are they?

Lots of questions.

Regards

> On May 20, 2020, at 11:43 AM, Modassar Ather  wrote:
> 
> Hi,
> 
> Currently we have index of size 3.5 TB. These index are distributed across
> 12 shards under two cores. The size of index on each shards are almost
> equal.
> We do a delta indexing every week and optimise the index.
> 
> The server configuration is as follows.
> 
>   - Solr Version  : 6.5.1
>   - AWS instance type : r5a.16xlarge
>   - CPU(s)  : 64
>   - RAM  : 512GB
>   - EBS size  : 7 TB (For indexing as well as index optimisation.)
>   - IOPs  : 3 (For faster index optimisation)
> 
> 
> Can you please help me with following few questions?
> 
>   - What is the ideal index size per shard?
>   - The optimisation takes lot of time and IOPs to complete. Will
>   increasing the number of shards help in reducing the optimisation time and
>   IOPs?
>   - We are planning to reduce each shard index size to 30GB and the entire
>   3.5 TB index will be distributed across more shards. In this case to almost
>   70+ shards. Will this help?
>   - Will adding so many new shards increase the search response time and
>   possibly how much?
>   - If we have to increase the shards should we do it on a single larger
>   server or should do it on multiple small servers?
> 
> 
> Kindly share your thoughts on how best we can use Solr with such a large
> index size.
> 
> Best,
> Modassar





Re: This IndexSchema is not mutable. Solr 7.3.1

2020-05-20 Thread Shawn Heisey

On 5/20/2020 4:30 PM, Vincenzo D'Amore wrote:

another update. I think I found the problem.
This error is generated when I have defined add-schema-fields in
the updateRequestProcessorChain.

In other words you can have ClassicIndexSchemaFactory, but (and this makes
sense) add-schema-fields has to be removed from
the updateRequestProcessorChain:


Absolutely correct.  The update processor that automatically adds 
unknown fields DOES require a mutable schema.  The classic schema will 
not work with it.
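A chain with the guessing processor removed might look roughly like this (the chain name and exact processor list here are illustrative, not taken from your config):

```xml
<!-- Illustrative solrconfig.xml fragment: no AddSchemaFieldsUpdateProcessorFactory,
     so nothing ever tries to mutate the classic schema. -->
<updateRequestProcessorChain name="no-guessing" default="true">
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.DistributedUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```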


Thanks,
Shawn


Re: This IndexSchema is not mutable. Solr 7.3.1

2020-05-20 Thread Vincenzo D'Amore
Hi all,

another update. I think I found the problem.
This error is generated when I have defined add-schema-fields in
the updateRequestProcessorChain.

In other words you can have ClassicIndexSchemaFactory, but (and this makes
sense) add-schema-fields has to be removed from
the updateRequestProcessorChain:

  

Any thought about this?



On Wed, May 20, 2020 at 11:27 PM Vincenzo D'Amore 
wrote:

> Hi Erick,
>
> thanks for the prompt support, I'm sure all the fields are defined (after
> all they are all strings and only 6).
>
> It seems that you cannot use CSV with ClassicIndexSchemaFactory
>
>
> On Wed, May 20, 2020 at 8:20 PM Erick Erickson 
> wrote:
>
>> It’s the _schema_ that’s not mutable. Which implies you have field
>> guessing turned _off_
>>
>> I’d take a look at the solr log, the error might be more informative. But
>> at a guess, you
>> need to define the fields you’re importing, namely id, name, surname,
>> gender, eyeColor and hairColor
>> in your schema.
>>
>> > On May 20, 2020, at 1:46 PM, Vincenzo D'Amore 
>> wrote:
>> >
>> > Hi all,
>> >
>> > I'm trying to import a csv file in solr
>> >
>> > id,name,surname,gender,eyeColor,hairColor
>> > 1,pippo,pluto,male,brown,brown
>> >
>> > I'm using this command
>> >
>> > curl '
>> >
http://localhost:8983/solr/videoid/update?commit=true&header=true&fieldnames=id,name,surname,gender,eyeColor,hairColor&separator=,
>> '
>> > -H "Content-Type: application/csv" --data-binary @test.csv
>> >
>> > But receiving the following error:
>> >
>> > {
>> >  "responseHeader":{
>> >"status":400,
>> >"QTime":32},
>> >  "error":{
>> >"metadata":[
>> >  "error-class","org.apache.solr.common.SolrException",
>> >  "root-error-class","org.apache.solr.common.SolrException"],
>> >"msg":"This IndexSchema is not mutable.",
>> >"code":400}}
>> >
>> > Do you know why the Solr index should be mutable?
>> >
>> >
>> >
>> > --
>> > Vincenzo D'Amore
>>
>>
>
> --
> Vincenzo D'Amore
>
>

-- 
Vincenzo D'Amore


Re: when to use docvalue

2020-05-20 Thread Revas
Thanks, Erick. It's just that when we enable both index=true and docValues=true,
it increases the index time by at least 2x for a full re-index.

On Wed, May 20, 2020 at 2:30 PM Erick Erickson 
wrote:

> Revas:
>
> Facet queries are just queries that are constrained by the total result
> set of your
> primary query, so the answer to that would be the same as speeding up
> regular
> queries. As far as range facets are concerned, I believe they _do_ use
> docValues,
> after all they have to answer the exact same question: For doc X in the
> result set,
> what is the value of field Y? The only difference is it has to bucket a
> bunch of them.
>
> Rahul: Please don’t hijack threads, it makes it difficult to find things
> later. Start
> a separate e-mail thread.
>
> The answer to your question is, of course, “it depends” on a number of
> things and
> changes with the query. First of all, multivalued fields don’t qualify
> because
> docValues are a sorted set, meaning the return is sorted and deduplicated.
> So if
> the input has f values in it, b c d c d, what you’d get back from DV is b
> c d.
>
> So let’s go with primitive, single-valued types. It still depends, but
> Solr does
> the right thing, or tries. Here’s the scoop. stored fields for any single
> doc are
> stored as a contiguous, compressed bit of memory. So if any _one_ field
> needs
> to be read from the stored data, the entire block is decompressed and Solr
> will
> preferentially fetch the value from the decompressed data as it’s pretty
> certain
> to be at least as cheap as fetching from DV. However, the reverse is true
> if _all_
> the returned values are single-valued DV fields. Then it’s more efficient
> to fetch
> the DV values as they’re MMapped, and won’t cost the seek-and-decompress
> cycle.
>
> Unless space is a real consideration for you, I’d set both index and
> docValues to
> true…
>
> Best,
> Erick
>
> > On May 20, 2020, at 10:45 AM, Rahul Goswami 
> wrote:
> >
> > Eric,
> > Thanks for that explanation. I have a follow up question on that. I find
> > the scenario of stored=true and docValues=true to be tricky at times...
> > would like to know when is each of these scenarios preferred over the
> other
> > two for primitive datatypes:
> >
> > 1) stored=true and docValues=false
> > 2) stored=false and docValues=true
> > 3) stored=true and docValues=true
> >
> > Thanks,
> > Rahul
> >
> > On Tue, May 19, 2020 at 5:55 PM Erick Erickson 
> > wrote:
> >
> >> They are _absolutely_ able to be used together. Background:
> >>
> >> “In the bad old days”, there was no docValues. So whenever you needed
> >> to facet/sort/group/use function queries Solr (well, Lucene) had to take
> >> the inverted structure resulting from “index=true” and “uninvert” it on
> the
> >> Java heap.
> >>
> >> docValues essentially does the “uninverting” at index time and puts
> >> that structure in a separate file for each segment. So rather than
> uninvert
> >> the index on the heap, Lucene can just read it in from disk in
> >> MMapDirectory
> >> (i.e. OS) memory space.
> >>
> >> The downside is that your index will be bigger when you do both, that is
> >> the
> >> size on disk will be bigger. But, it’ll be much faster to load, much
> >> faster to
> >> autowarm, and will move the structures necessary to do
> faceting/sorting/etc
> >> into OS memory where the garbage collection is vastly more efficient
> than
> >> Javas.
> >>
> >> And frankly I don’t think the increased size on disk is a downside.
> You’ll
> >> have
> >> to have the memory anyway, and having it used on the OS memory space is
> >> so much more efficient than on Java’s heap that it’s a win-win IMO.
> >>
> >> Oh, and if you never sort/facet/group/use function queries, then the
> >> docValues structures are never even read into MMapDirectory space.
> >>
> >> So yes, freely do both.
> >>
> >> Best,
> >> Erick
> >>
> >>
> >>> On May 19, 2020, at 5:41 PM, matthew sporleder 
> >> wrote:
> >>>
> >>> You can index AND docvalue?  For some reason I thought they were
> >> exclusive
> >>>
> >>> On Tue, May 19, 2020 at 5:36 PM Erick Erickson <
> erickerick...@gmail.com>
> >> wrote:
> 
>  Yes. You should also index them….
> 
>  Here’s the way I think of it.
> 
>  For questions “For term X, which docs contain that value?” means
> >> index=true. This is a search.
> 
>  For questions “Does doc X have value Y in field Z”, means
> >> docValues=true.
> 
>  what’s the difference? Well, the first one is to get the result set.
> >> The second is for, given a result set,
>  count/sort/whatever.
> 
>  fq clauses are searches, so index=true.
> 
>  sorting, faceting, grouping and function queries  are “for each doc in
> >> the result set, what values does field Y contain?”
> 
>  Maybe that made things clear as mud, but it’s the way I think of it ;)
> 
>  Best,
>  Erick
> 
> 
> 
>  fq clauses are searches. Indexed=true is for searching.
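In schema terms, the "set both" suggestion boils down to something like this (field and type names are invented for illustration):

```xml
<!-- Hypothetical: a field used in fq/sort/facet gets both flags. -->
<field name="price" type="pfloat" indexed="true" stored="true" docValues="true"/>
```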
> 
> 

Re: Shingles behavior

2020-05-20 Thread Alexandre Rafalovitch
Did you try it with the 'sow' parameter both ways? I am not sure I fully
understand the question, especially with shingling on both passes
rather than just the indexing one. But at least it is something to try, and
it is one of the areas where Solr and ES differ.

Regards,
   Alex.

On Tue, 19 May 2020 at 05:59, Radu Gheorghe  wrote:
>
> Hello Solr users,
>
> I’m quite puzzled about how shingles work. The way tokens are analysed looks 
> fine to me, but the query seems too restrictive.
>
> Here’s the sample use-case. I have three documents:
>
> mona lisa smile
> mona lisa
> mona
>
> I have a shingle filter set up like this (both index- and query-time):
>
> > <filter class="solr.ShingleFilterFactory" maxShingleSize="4"/>
>
> When I query for “Mona Lisa smile” (no quotes), I expect to get all three 
> documents back, in that order. Because the first document matches all the 
> terms:
>
> mona
> mona lisa
> mona lisa smile
> lisa
> lisa smile
> smile
>
> And the second one matches only some, and the third document only matches one.
>
> Instead, I only get the first document back. That’s because the query expects 
> all the “words” to match:
>
> > "parsedquery":"+DisjunctionMaxQuery((((+shingle_field:mona 
> > > +usage_query_view_tags:lisa +shingle_field:smile) (+shingle_field:mona 
> > > +shingle_field:lisa smile) (+shingle_field:mona lisa +shingle_field:smile) 
> > > (shingle_field:mona lisa smile))))",
>
> The query above is generated by the Edismax query parser, when I’m using 
> “shingle_field” as “df”.
>
> Is there a way to get “any of the words” to match? I’ve tried all the options 
> I can think of:
> - different query parsers
> - q.op=OR
> - mm=0 (or 1 or 0% or 10% or…)
>
> Nothing seems to change the parsed query from the above.
>
> I’ve compared this to the behaviour of Elasticsearch. There, I get “OR” by 
> default, and minimum_should_match works as expected. The only difference I 
> see between the two, on the analysis side, is that tokens start at 0 in 
> Elasticsearch and at 1 in Solr. I doubt that’s the problem, because I see 
> that the default “text_en”, for example, also starts at position 1.
>
> Is it just a bug that mm doesn’t work in the context of shingles? Or is there 
> a workaround?
>
> Thanks and best regards,
> Radu
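As a sanity check on the expectation above, the shingle output and the hoped-for overlap counts can be simulated with a toy re-implementation (this mimics a ShingleFilter with outputUnigrams=true and maxShingleSize=4; it is not Lucene's actual code):

```python
def shingles(text, max_size=4):
    """Emit each unigram plus shingles of up to max_size words, per start position."""
    words = text.lower().split()
    out = []
    for i in range(len(words)):
        for n in range(1, max_size + 1):
            if i + n <= len(words):
                out.append(" ".join(words[i:i + n]))
    return out

query = set(shingles("Mona Lisa smile"))
docs = ["mona lisa smile", "mona lisa", "mona"]
# Hoped-for ranking: 6, 3 and 1 matching shingles respectively.
overlap = [len(query & set(shingles(d))) for d in docs]
```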


Re: This IndexSchema is not mutable. Solr 7.3.1

2020-05-20 Thread Vincenzo D'Amore
Hi Erick,

thanks for the prompt support, I'm sure all the fields are defined (after
all they are all strings and only 6).

It seems that you cannot use CSV with ClassicIndexSchemaFactory


On Wed, May 20, 2020 at 8:20 PM Erick Erickson 
wrote:

> It’s the _schema_ that’s not mutable. Which implies you have field
> guessing turned _off_
>
> I’d take a look at the solr log, the error might be more informative. But
> at a guess, you
> need to define the fields you’re importing, namely id, name, surname,
> gender, eyeColor and hairColor
> in your schema.
>
> > On May 20, 2020, at 1:46 PM, Vincenzo D'Amore 
> wrote:
> >
> > Hi all,
> >
> > I'm trying to import a csv file in solr
> >
> > id,name,surname,gender,eyeColor,hairColor
> > 1,pippo,pluto,male,brown,brown
> >
> > I'm using this command
> >
> > curl '
> >
> http://localhost:8983/solr/videoid/update?commit=true&header=true&fieldnames=id,name,surname,gender,eyeColor,hairColor&separator=,
> '
> > -H "Content-Type: application/csv" --data-binary @test.csv
> >
> > But receiving the following error:
> >
> > {
> >  "responseHeader":{
> >"status":400,
> >"QTime":32},
> >  "error":{
> >"metadata":[
> >  "error-class","org.apache.solr.common.SolrException",
> >  "root-error-class","org.apache.solr.common.SolrException"],
> >"msg":"This IndexSchema is not mutable.",
> >"code":400}}
> >
> > Do you know why the Solr index should be mutable?
> >
> >
> >
> > --
> > Vincenzo D'Amore
>
>

-- 
Vincenzo D'Amore


Re: What is the logical order of applying sorts in SOLR?

2020-05-20 Thread Alexandre Rafalovitch
If you use sort, you are basically ignoring relevancy (unless you put
that into sort). Which you seem to know as your example uses FQ.

Do you see performance drop on non-clustered or clustered Solr?
Because, I would not be surprised if, for clustered node, all the
results need to be brought into one place to sort even if only 10 (of
say 100) would be sent back, where without sort, each node is asked
for their "top X" matches and others are never even sent. That would
be my working theory anyway; I am not deep into the multi-path mode the
cluster code uses.

Regards,
   Alex.

On Mon, 11 May 2020 at 15:16, Stephen Lewis Bianamara
 wrote:
>
> Hi SOLR Community,
>
> What is the order of operations which SOLR applies to sorting? I've
> observed many times and across SOLR versions that a restrictive filter with
> a sort takes an extremely long time to return, suggesting to me that the
> SORT is applied before the filter.
>
> An example situation is querying for fq:Foo=Bar vs querying for fq:Foo=Bar
> sort by Id desc. I've observed over many SOLR versions and collections that
> the former is orders of magnitude cheaper and quicker to respond, even when
> the result set is tiny (10-100).
>
> Does anyone in this forum know whether this is the default behavior and
> whether there is any way through the API or SOLR configuration to apply
> sorts after filters?
>
> Thanks,
> Stephen


Re: Large query size in Solr 8.3.0

2020-05-20 Thread Alexandre Rafalovitch
Does this actually work? This individual ID matching feels like a very
fragile attempt at enforcing the sort order and maybe represents an
architectural issue. Maybe you need to do some joins or graph walking
instead. Or, more likely, you would benefit from over-fetching and
just sorting on the ids on the frontend, since you have those IDs
already. You are over-fetching already anyway (rows=250), so you don't
seem to worry that much about payload size.

But, apart from that:
1) Switch from GET to POST
2) 'fl' field list and others after it are probably not very mutable,
this can go into defaults for the request handler (custom one perhaps)
3) You don't seem to use filter queries. But you also have a lot of
binary flags that may benefit from being pushed into 'fq' and improve
caching/minimize score calculations. You could also have not-cached
FQs if you think they will not be reused
4) If you have sets of params that repeat often but not always, you
could do some variable substitutions to loop them in with paramSets
5) Move the sorting query into a boost query, just for clarity of intent
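For point 2, such a custom handler might look roughly like this (the handler name and default values are invented):

```xml
<!-- Hypothetical solrconfig.xml fragment: stable parameters move out of the URL. -->
<requestHandler name="/bigsearch" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">edismax</str>
    <str name="rows">250</str>
    <str name="fl">id,title,score</str>
  </lst>
</requestHandler>
```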

Regards,
  Alex.


On Tue, 19 May 2020 at 10:16, vishal patel
 wrote:
>
>
> Which query parser is used if my query length is large?
> My query is 
> https://drive.google.com/file/d/1P609VQReKM0IBzljvG2PDnyJcfv1P3Dz/view
>
>
> Regards,
> Vishal Patel


json faceting - Terms faceting and EnumField

2020-05-20 Thread Ponnuswamy, Poornima (GE Healthcare)
Hello,

We have solr 6.6 version.
Below is the field and field type that is defined in solr schema.



Below is the configuration for the enum
   

  servicerequestcorrective
  servicerequestplanned
  servicerequestinstallationandupgrade
  servicerequestrecall
  servicerequestother
  servicerequestinquiry
  servicerequestproactive
  servicerequestsystemupdate
  servicerequesticenteradmin
  servicerequestonwatch
  servicerequestfmi
  servicerequestapplication
   

When I try to invoke the call below, I get an error:
http://localhost:8983/solr/activity01us/select?json.facet={ServiceRequestTypeCode:{type:terms,
 field:ServiceRequestTypeCode, limit:10}}&facet=on&indent=on&wt=json&q=*
"Expected numeric field type 
:ServiceRequestTypeCode{type=ServiceRequestTypeCode,properties=indexed,stored,omitNorms,omitTermFreqAndPositions}"

But when I try to do as below it works fine.

http://localhost:8983/solr/activity01us/select?facet.field=ServiceRequestTypeCode&facet=on&indent=on&q=*:*&wt=json

I would like to use json facet as it would help me in subfaceting.

Any help would be appreciated
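For reference, the definitions are along these lines (the field type name, enum name and file name are placeholders; only the value list above is real):

```xml
<!-- schema.xml (names hypothetical) -->
<fieldType name="serviceRequestType" class="solr.EnumField"
           enumsConfig="enumsConfig.xml" enumName="serviceRequestTypes"/>
<field name="ServiceRequestTypeCode" type="serviceRequestType"
       indexed="true" stored="true"/>

<!-- enumsConfig.xml -->
<enumsConfig>
  <enum name="serviceRequestTypes">
    <value>servicerequestcorrective</value>
    <value>servicerequestplanned</value>
    <value>servicerequestinstallationandupgrade</value>
    <value>servicerequestrecall</value>
    <value>servicerequestother</value>
    <value>servicerequestinquiry</value>
    <value>servicerequestproactive</value>
    <value>servicerequestsystemupdate</value>
    <value>servicerequesticenteradmin</value>
    <value>servicerequestonwatch</value>
    <value>servicerequestfmi</value>
    <value>servicerequestapplication</value>
  </enum>
</enumsConfig>
```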


Thanks,
Poornima



Re: when to use docvalue

2020-05-20 Thread Erick Erickson
Revas:

Facet queries are just queries that are constrained by the total result set of 
your
primary query, so the answer to that would be the same as speeding up regular
queries. As far as range facets are concerned, I believe they _do_ use 
docValues,
after all they have to answer the exact same question: For doc X in the result 
set,
what is the value of field Y? The only difference is it has to bucket a bunch 
of them.

Rahul: Please don’t hijack threads, it makes it difficult to find things 
later. Start 
a separate e-mail thread.

The answer to your question is, of course, “it depends” on a number of things 
and
changes with the query. First of all, multivalued fields don’t qualify because
docValues are a sorted set, meaning the return is sorted and deduplicated. So if
the input has 5 values in it, b c d c d, what you’d get back from DV is b c d.

So let’s go with primitive, single-valued types. It still depends, but Solr does
the right thing, or tries. Here’s the scoop. stored fields for any single doc 
are
stored as a contiguous, compressed bit of memory. So if any _one_ field needs
to be read from the stored data, the entire block is decompressed and Solr will
preferentially fetch the value from the decompressed data as it’s pretty certain
to be at least as cheap as fetching from DV. However, the reverse is true if 
_all_
the returned values are single-valued DV fields. Then it’s more efficient to 
fetch
the DV values as they’re MMapped, and won’t cost the seek-and-decompress cycle.

Unless space is a real consideration for you, I’d set both index and docValues 
to
true…

Best,
Erick

> On May 20, 2020, at 10:45 AM, Rahul Goswami  wrote:
> 
> Eric,
> Thanks for that explanation. I have a follow up question on that. I find
> the scenario of stored=true and docValues=true to be tricky at times...
> would like to know when is each of these scenarios preferred over the other
> two for primitive datatypes:
> 
> 1) stored=true and docValues=false
> 2) stored=false and docValues=true
> 3) stored=true and docValues=true
> 
> Thanks,
> Rahul
> 
> On Tue, May 19, 2020 at 5:55 PM Erick Erickson 
> wrote:
> 
>> They are _absolutely_ able to be used together. Background:
>> 
>> “In the bad old days”, there was no docValues. So whenever you needed
>> to facet/sort/group/use function queries Solr (well, Lucene) had to take
>> the inverted structure resulting from “index=true” and “uninvert” it on the
>> Java heap.
>> 
>> docValues essentially does the “uninverting” at index time and puts
>> that structure in a separate file for each segment. So rather than uninvert
>> the index on the heap, Lucene can just read it in from disk in
>> MMapDirectory
>> (i.e. OS) memory space.
>> 
>> The downside is that your index will be bigger when you do both, that is
>> the
>> size on disk will be bigger. But, it’ll be much faster to load, much
>> faster to
>> autowarm, and will move the structures necessary to do faceting/sorting/etc
>> into OS memory where the garbage collection is vastly more efficient than
>> Javas.
>> 
>> And frankly I don’t think the increased size on disk is a downside. You’ll
>> have
>> to have the memory anyway, and having it used on the OS memory space is
>> so much more efficient than on Java’s heap that it’s a win-win IMO.
>> 
>> Oh, and if you never sort/facet/group/use function queries, then the
>> docValues structures are never even read into MMapDirectory space.
>> 
>> So yes, freely do both.
>> 
>> Best,
>> Erick
>> 
>> 
>>> On May 19, 2020, at 5:41 PM, matthew sporleder 
>> wrote:
>>> 
>>> You can index AND docvalue?  For some reason I thought they were
>> exclusive
>>> 
>>> On Tue, May 19, 2020 at 5:36 PM Erick Erickson 
>> wrote:
 
 Yes. You should also index them….
 
 Here’s the way I think of it.
 
 For questions “For term X, which docs contain that value?” means
>> index=true. This is a search.
 
 For questions “Does doc X have value Y in field Z”, means
>> docValues=true.
 
 what’s the difference? Well, the first one is to get the result set.
>> The second is for, given a result set,
 count/sort/whatever.
 
 fq clauses are searches, so index=true.
 
 sorting, faceting, grouping and function queries  are “for each doc in
>> the result set, what values does field Y contain?”
 
 Maybe that made things clear as mud, but it’s the way I think of it ;)
 
 Best,
 Erick
 
 
 
 fq clauses are searches. Indexed=true is for searching.
 
 sort
 
> On May 19, 2020, at 4:00 PM, matthew sporleder 
>> wrote:
> 
> I have quite a few numeric / meta-data type fields in my schema and
> pretty much only use them in fq=, sort=, and friends.  Should I always
> use DocValue on these if i never plan to q=search: on them?  Are there
> any drawbacks?
> 
> Thanks,
> Matt
 
>> 
>> 
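The index/docValues distinction above can be sketched as schema field definitions — a hedged illustration (field names here are hypothetical, not from this thread):

```xml
<!-- searched in q/fq only: the inverted index is enough -->
<field name="title_txt"  type="text_general" indexed="true"  docValues="false" stored="true"/>

<!-- only sorted/faceted on, never searched: docValues alone suffices -->
<field name="popularity" type="pint"         indexed="false" docValues="true"  stored="false"/>

<!-- both filtered on (fq) and sorted/faceted on: enable both -->
<field name="price"      type="pfloat"       indexed="true"  docValues="true"  stored="true"/>
```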



Re: This IndexSchema is not mutable. Solr 7.3.1

2020-05-20 Thread Erick Erickson
It’s the _schema_ that’s not mutable, which implies you have field guessing 
turned _off_.

I’d take a look at the solr log, the error might be more informative. But at a 
guess, you
need to define the fields you’re importing, namely id, name, surname, gender, 
eyeColor and hairColor
in your schema.
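Before importing, the missing fields could be declared up front via the Schema API — a sketch assuming a managed schema (the collection name and field types here are illustrative, and the command needs a running Solr):

```shell
# declare the fields the CSV expects; "id" usually already exists in the schema
curl -X POST -H 'Content-type:application/json' \
  'http://localhost:8983/solr/videoid/schema' --data-binary '{
  "add-field": [
    {"name":"name",      "type":"string", "stored":true},
    {"name":"surname",   "type":"string", "stored":true},
    {"name":"gender",    "type":"string", "stored":true},
    {"name":"eyeColor",  "type":"string", "stored":true},
    {"name":"hairColor", "type":"string", "stored":true}
  ]
}'
```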

> On May 20, 2020, at 1:46 PM, Vincenzo D'Amore  wrote:
> 
> Hi all,
> 
> I'm trying to import a csv file in solr
> 
> id,name,surname,gender,eyeColor,hairColor
> 1,pippo,pluto,male,brown,brown
> 
> I'm using this command
> 
> curl '
> http://localhost:8983/solr/videoid/update?commit=true&header=true&fieldnames=id,name,surname,gender,eyeColor,hairColor&separator=,'
> -H "Content-Type: application/csv" --data-binary @test.csv
> 
> But receiving the following error:
> 
> {
>  "responseHeader":{
>"status":400,
>"QTime":32},
>  "error":{
>"metadata":[
>  "error-class","org.apache.solr.common.SolrException",
>  "root-error-class","org.apache.solr.common.SolrException"],
>"msg":"This IndexSchema is not mutable.",
>"code":400}}
> 
> Do you know why the Solr index should be mutable?
> 
> 
> 
> -- 
> Vincenzo D'Amore



Haystack is Back! Not just one - but three search conferences

2020-05-20 Thread Charlie Hull

Hi all,

So there's no Haystack in Charlottesville this year - but we've done our 
very best to bring you some of the talks and training we planned online 
- find out more at 
https://opensourceconnections.com/blog/2020/05/18/haystack-is-back-go-virtual-for-relevant-search-talks-workshops-discussions-training/


One part of this is three conferences, Berlin Buzzwords, Haystack and 
MICES, have come together for a week of online talks, workshops, panels 
and discussions. There's lots of great search related content including 
Uwe Schindler on Lucene 9, Doug Turnbull & Trey Grainger on AI-Powered 
Search, Tim Allison of NASA on genetic algorithms, a panel on result 
diversity, a workshop on the opensource ecommerce search ecosystem...do 
check it out at www.berlinbuzzwords.de . I'm running a Lightning Talks 
session too (let me know if you've got a talk).


Cheers

Charlie

--

Charlie Hull
OpenSource Connections, previously Flax

tel/fax: +44 (0)8700 118334
mobile:  +44 (0)7767 825828
web: www.o19s.com



This IndexSchema is not mutable. Solr 7.3.1

2020-05-20 Thread Vincenzo D'Amore
Hi all,

I'm trying to import a csv file in solr

id,name,surname,gender,eyeColor,hairColor
1,pippo,pluto,male,brown,brown

I'm using this command

curl '
http://localhost:8983/solr/videoid/update?commit=true&header=true&fieldnames=id,name,surname,gender,eyeColor,hairColor&separator=,'
-H "Content-Type: application/csv" --data-binary @test.csv

But receiving the following error:

{
  "responseHeader":{
"status":400,
"QTime":32},
  "error":{
"metadata":[
  "error-class","org.apache.solr.common.SolrException",
  "root-error-class","org.apache.solr.common.SolrException"],
"msg":"This IndexSchema is not mutable.",
"code":400}}

Do you know why the Solr index should be mutable?



-- 
Vincenzo D'Amore


Need help on handling large size of index.

2020-05-20 Thread Modassar Ather
Hi,

Currently we have an index of size 3.5 TB. The index is distributed across
12 shards under two cores. The size of the index on each shard is almost
equal.
We do delta indexing every week and optimise the index.

The server configuration is as follows.

   - Solr Version  : 6.5.1
   - AWS instance type : r5a.16xlarge
   - CPU(s)  : 64
   - RAM  : 512GB
   - EBS size  : 7 TB (For indexing as well as index optimisation.)
   - IOPs  : 3 (For faster index optimisation)


Can you please help me with following few questions?

   - What is the ideal index size per shard?
   - The optimisation takes a lot of time and IOPs to complete. Will
   increasing the number of shards help in reducing the optimisation time and
   IOPs?
   - We are planning to reduce each shard's index size to 30GB, so the entire
   3.5 TB index will be distributed across more shards, in this case almost
   70+. Will this help?
   - Will adding so many new shards increase the search response time, and
   if so, roughly by how much?
   - If we have to increase the shards, should we do it on a single larger
   server or on multiple smaller servers?


Kindly share your thoughts on how best we can use Solr with such a large
index size.
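If sharding up is the route taken, note that existing shards can be split in place with the Collections API rather than reindexing everything — a sketch with illustrative names (repeat per shard, and expect it to need disk headroom; this requires a running Solr):

```shell
# split shard1 of "mycollection" into two sub-shards
curl 'http://localhost:8983/solr/admin/collections?action=SPLITSHARD&collection=mycollection&shard=shard1'
```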

Best,
Modassar


Use cases for the graph streams

2020-05-20 Thread Nightingale, Jonathan A (US)
This is kind of a broad question, but I was playing with the graph streams and 
was having trouble making the tools work for what I wanted to do. I'm wondering 
whether the use case for the graph streams really supports the standard graph queries 
you might use with Gremlin or the like. I ask because right now we have two 
implementations of our data storage to support these two ways of looking at it, 
the standard query and the semantic filtering.

The use cases I usually see for the graph streams always seem to be limited to 
one-link traversal for finding things related to nodes gathered from a query. 
But even with that, it wasn't clear what the best way was to handle lists of 
docValues. So, for example, to represent a node that had many docValues I had 
to use cross products to make a node for each value. The traversal didn't seem 
to inherently allow for that kind of node linking.

So my question really is (and maybe this is not the place for this) what is the 
intent of these graph features and what is the goal for them in the future? I 
was really hoping at one point to only use solr for our product but it didn't 
seem feasible, at least not easily.
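For reference, the one-hop traversal pattern described above is what the nodes() streaming expression expresses — a sketch with made-up collection and field names:

```
nodes(emails,
      search(emails, q="body:solr", fl="from_s", sort="from_s asc", qt="/export"),
      walk="from_s->to_s",
      gather="to_s")
```

Modeling multi-valued relationships beyond a single hop is where this shape gets awkward, as the message above describes.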

Thanks for all your help
Jonathan

Jonathan Nightingale
GXP Solutions Engineer
(office) 315 838 2273
(cell) 315 271 0688



Re: when to use docvalue

2020-05-20 Thread Rahul Goswami
Erick,
Thanks for that explanation. I have a follow-up question on that. I find
the scenario of stored=true and docValues=true to be tricky at times... I
would like to know when each of these scenarios is preferred over the other
two for primitive datatypes:

1) stored=true and docValues=false
2) stored=false and docValues=true
3) stored=true and docValues=true

Thanks,
Rahul

On Tue, May 19, 2020 at 5:55 PM Erick Erickson 
wrote:

> They are _absolutely_ able to be used together. Background:
>
> “In the bad old days”, there was no docValues. So whenever you needed
> to facet/sort/group/use function queries Solr (well, Lucene) had to take
> the inverted structure resulting from “index=true” and “uninvert” it on the
> Java heap.
>
> docValues essentially does the “uninverting” at index time and puts
> that structure in a separate file for each segment. So rather than uninvert
> the index on the heap, Lucene can just read it in from disk in
> MMapDirectory
> (i.e. OS) memory space.
>
> The downside is that your index will be bigger when you do both, that is
> the
> size on disk will be bigger. But, it’ll be much faster to load, much
> faster to
> autowarm, and will move the structures necessary to do faceting/sorting/etc
> into OS memory where the garbage collection is vastly more efficient than
> Java's.
>
> And frankly I don’t think the increased size on disk is a downside. You’ll
> have
> to have the memory anyway, and having it used on the OS memory space is
> so much more efficient than on Java’s heap that it’s a win-win IMO.
>
> Oh, and if you never sort/facet/group/use function queries, then the
> docValues structures are never even read into MMapDirectory space.
>
> So yes, freely do both.
>
> Best,
> Erick
>
>
> > On May 19, 2020, at 5:41 PM, matthew sporleder 
> wrote:
> >
> > You can index AND docvalue?  For some reason I thought they were
> exclusive
> >
> > On Tue, May 19, 2020 at 5:36 PM Erick Erickson 
> wrote:
> >>
> >> Yes. You should also index them….
> >>
> >> Here’s the way I think of it.
> >>
> >> For questions “For term X, which docs contain that value?” means
> index=true. This is a search.
> >>
> >> For questions “Does doc X have value Y in field Z”, means
> docValues=true.
> >>
> >> what’s the difference? Well, the first one is to get the result set.
> The second is for, given a result set,
> >> count/sort/whatever.
> >>
> >> fq clauses are searches, so index=true.
> >>
> >> sorting, faceting, grouping and function queries  are “for each doc in
> the result set, what values does field Y contain?”
> >>
> >> Maybe that made things clear as mud, but it’s the way I think of it ;)
> >>
> >> Best,
> >> Erick
> >>
> >>
> >>
> >> fq clauses are searches. Indexed=true is for searching.
> >>
> >> sort
> >>
> >>> On May 19, 2020, at 4:00 PM, matthew sporleder 
> wrote:
> >>>
> >>> I have quite a few numeric / meta-data type fields in my schema and
> >>> pretty much only use them in fq=, sort=, and friends.  Should I always
> >>> use DocValue on these if i never plan to q=search: on them?  Are there
> >>> any drawbacks?
> >>>
> >>> Thanks,
> >>> Matt
> >>
>
>


Re: when to use docvalue

2020-05-20 Thread Revas
Erick, can you also explain how to optimize facet queries and range facets, as
they don't use docValues and contribute to higher response times?

On Tue, May 19, 2020 at 5:55 PM Erick Erickson 
wrote:

> They are _absolutely_ able to be used together. Background:
>
> “In the bad old days”, there was no docValues. So whenever you needed
> to facet/sort/group/use function queries Solr (well, Lucene) had to take
> the inverted structure resulting from “index=true” and “uninvert” it on the
> Java heap.
>
> docValues essentially does the “uninverting” at index time and puts
> that structure in a separate file for each segment. So rather than uninvert
> the index on the heap, Lucene can just read it in from disk in
> MMapDirectory
> (i.e. OS) memory space.
>
> The downside is that your index will be bigger when you do both, that is
> the
> size on disk will be bigger. But, it’ll be much faster to load, much
> faster to
> autowarm, and will move the structures necessary to do faceting/sorting/etc
> into OS memory where the garbage collection is vastly more efficient than
> Java's.
>
> And frankly I don’t think the increased size on disk is a downside. You’ll
> have
> to have the memory anyway, and having it used on the OS memory space is
> so much more efficient than on Java’s heap that it’s a win-win IMO.
>
> Oh, and if you never sort/facet/group/use function queries, then the
> docValues structures are never even read into MMapDirectory space.
>
> So yes, freely do both.
>
> Best,
> Erick
>
>
> > On May 19, 2020, at 5:41 PM, matthew sporleder 
> wrote:
> >
> > You can index AND docvalue?  For some reason I thought they were
> exclusive
> >
> > On Tue, May 19, 2020 at 5:36 PM Erick Erickson 
> wrote:
> >>
> >> Yes. You should also index them….
> >>
> >> Here’s the way I think of it.
> >>
> >> For questions “For term X, which docs contain that value?” means
> index=true. This is a search.
> >>
> >> For questions “Does doc X have value Y in field Z”, means
> docValues=true.
> >>
> >> what’s the difference? Well, the first one is to get the result set.
> The second is for, given a result set,
> >> count/sort/whatever.
> >>
> >> fq clauses are searches, so index=true.
> >>
> >> sorting, faceting, grouping and function queries  are “for each doc in
> the result set, what values does field Y contain?”
> >>
> >> Maybe that made things clear as mud, but it’s the way I think of it ;)
> >>
> >> Best,
> >> Erick
> >>
> >>
> >>
> >> fq clauses are searches. Indexed=true is for searching.
> >>
> >> sort
> >>
> >>> On May 19, 2020, at 4:00 PM, matthew sporleder 
> wrote:
> >>>
> >>> I have quite a few numeric / meta-data type fields in my schema and
> >>> pretty much only use them in fq=, sort=, and friends.  Should I always
> >>> use DocValue on these if i never plan to q=search: on them?  Are there
> >>> any drawbacks?
> >>>
> >>> Thanks,
> >>> Matt
> >>
>
>


Query takes more time in Solr 8.5.1 compare to 6.1.0 version

2020-05-20 Thread jay harkhani
Hello,

I recently upgraded Solr from version 6.1.0 to 8.5.1 and came across one issue. 
A query which has many ids (around 3000) and grouping applied takes much longer 
to execute: in Solr 6.1.0 it takes 677ms, while in Solr 8.5.1 it takes 26090ms. 
While taking these readings we had the same Solr schema and the same number of 
records in both Solr versions.

Please refer below details for query, logs and thead dump (generate from Solr 
Admin while execute query).

Query : https://drive.google.com/file/d/1bavCqwHfJxoKHFzdOEt-mSG8N0fCHE-w/view

Logs and Thread dump stack trace
Solr 8.5.1 : 
https://drive.google.com/file/d/149IgaMdLomTjkngKHrwd80OSEa1eJbBF/view
Solr 6.1.0 : 
https://drive.google.com/file/d/13v1u__fM8nHfyvA0Mnj30IhdffW6xhwQ/view

To analyse further, we found that if we remove the grouping field or reduce the 
number of ids in the query, it executes fast. Did anything change in 8.5.1 
compared to 6.1.0? In 6.1.0, even for a large number of ids along with grouping, 
it works faster.

Can someone please help to isolate this issue.

Regards,
Jay Harkhani.


Re: Different indexing times for two different collections with different data sizes

2020-05-20 Thread Erick Erickson
The easy question first. There is an absolute limit of 2B docs per shard. 
Internally, Lucene assigns an integer internal document ID that overflows after 
2B. That includes deleted docs, so your “maxDoc” on the admin page is the 
limit. Practically, as you are finding, you run into performance issues at 
counts significantly lower than 2B. Note that when segments are merged, the internal IDs get 
reassigned...
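A hedged way to check where a shard stands against that limit is the Luke request handler, which reports maxDoc (including deletions) per core — the core name below is illustrative, and this needs a running Solr:

```shell
# maxDoc in the response counts against the 2B-per-shard limit
curl 'http://localhost:8983/solr/mycore_shard1_replica_n1/admin/luke?numTerms=0&wt=json'
```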

Indexing scales pretty linearly with the number of shards, _assuming_ you’re 
adding more hardware. To really answer the question you need to look at what 
the bottleneck is on your current system. IOW, “It Depends(tm)”.

Let’s claim your current system is running all your CPUs flat out. Or I/O is 
maxed out. Adding more shards to the existing hardware won’t help. Perhaps you 
don’t even need more shards, you just need to move some of your replicas to new 
hardware.

OTOH, let’s claim that your indexing isn’t straining your current hardware at 
all, then adding more shards to existing hardware should increase throughput.

Probably the issue is merging. When segments are merged, they’re re-written. My 
guess is that your larger collection is doing more merging than your test 
collection, but that’s a guess. See Mike McCandless’ blog, TieredMergePolicy is 
the default you’re probably using: 
http://blog.mikemccandless.com/2011/02/visualizing-lucenes-segment-merges.html

Best,
Erick

> On May 20, 2020, at 7:25 AM, Kommu, Vinodh K.  wrote:
> 
> Hi,
> 
> Recently we had noticed that one of the largest collection (shards = 6 ; 
> replication factor =3) which holds up to 1TB of data & nearly 3.2 billion of 
> docs is taking longer time to index than it used to before. To see the 
> indexing time difference, we created another collection using largest 
> collection configs (schema.xml and solrconfig.xml files) and loaded the 
> collection with up to 100 million docs which is ~60G of data. Later we tried 
> to index exactly same 25 million docs data file on these two collections 
> which clearly showed timing difference. BTW, we are running on Solr 7.7.1 
> version.
> 
> Original largest collection has completed indexing in ~100mins
> Newly created collection (which has 100 million docs) has completed in ~70mins
> 
> This indexing time difference is due to the amount of data that each 
> collection hold? If yes, how to increase indexing performance on larger data 
> collection? adding more shards can help here?
> 
> Also, is there any threshold numbers for a single shard can hold in terms of 
> size and number of docs before adding a new shard?
> 
> Any answers would really help!!
> 
> 
> Thanks & Regards,
> Vinodh
> 
> DTCC DISCLAIMER: This email and any files transmitted with it are 
> confidential and intended solely for the use of the individual or entity to 
> whom they are addressed. If you have received this email in error, please 
> notify us immediately and delete the email and any attachments from your 
> system. The recipient should check this email and any attachments for the 
> presence of viruses. The company accepts no liability for any damage caused 
> by any virus transmitted by this email.



MIGRATE without split.key?

2020-05-20 Thread YangLiu
Hello everyone,


I want to migrate data from one collection to another with the MIGRATE API, but 
it cannot be executed unless the split.key parameter is specified. 


Why can't this limitation be removed? Is there a better way to migrate data?


Thanks.
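For context, MIGRATE only moves the documents whose route key matches, which is why split.key is mandatory — a sketch with illustrative collection names (requires a running Solr):

```shell
# moves all docs whose ids are routed under "a!" from source to target
curl 'http://localhost:8983/solr/admin/collections?action=MIGRATE&collection=sourceColl&target.collection=targetColl&split.key=a!'
```

If the goal is a full copy rather than a routed subset, the REINDEXCOLLECTION API (Solr 8.1+) with its target parameter may be a better fit.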

Different indexing times for two different collections with different data sizes

2020-05-20 Thread Kommu, Vinodh K.
Hi,

Recently we noticed that one of our largest collections (shards = 6; 
replication factor = 3), which holds up to 1TB of data & nearly 3.2 billion 
docs, is taking longer to index than it used to. To see the indexing time 
difference, we created another collection using the largest collection's configs 
(schema.xml and solrconfig.xml files) and loaded it with up to 100 
million docs, which is ~60G of data. Later we tried to index exactly the same 25 
million docs data file into these two collections, which clearly showed the 
timing difference. BTW, we are running Solr version 7.7.1.

Original largest collection has completed indexing in ~100mins
Newly created collection (which has 100 million docs) has completed in ~70mins

Is this indexing time difference due to the amount of data that each collection 
holds? If yes, how can we increase indexing performance on the larger collection? 
Would adding more shards help here?

Also, are there any threshold numbers for what a single shard can hold in terms 
of size and number of docs before adding a new shard?

Any answers would really help!!


Thanks & Regards,
Vinodh

DTCC DISCLAIMER: This email and any files transmitted with it are confidential 
and intended solely for the use of the individual or entity to whom they are 
addressed. If you have received this email in error, please notify us 
immediately and delete the email and any attachments from your system. The 
recipient should check this email and any attachments for the presence of 
viruses. The company accepts no liability for any damage caused by any virus 
transmitted by this email.


Re: REINDEXCOLLECTION not working on an alias

2020-05-20 Thread Bjarke Buur Mortensen
OK, that makes sense.
 Looking forward to that fix, thanks for the reply.
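Until the fix lands, one workaround sketch (names illustrative) is to reindex against the underlying collection and then repoint the main alias at the result, avoiding the alias-to-alias chain:

```shell
# reindex the concrete collection, not the alias
curl 'http://localhost:8983/solr/admin/collections?action=REINDEXCOLLECTION&name=supplier_products_v1&cmd=start'
# once done, repoint the main alias at whatever target collection was produced
curl 'http://localhost:8983/solr/admin/collections?action=CREATEALIAS&name=qa_supplier_products&collections=supplier_products_v1_reindexed'
```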

Den tir. 19. maj 2020 kl. 17.21 skrev Joel Bernstein :

> I believe the issue is that under the covers this feature is using the
> "topic" streaming expressions which it was just reported doesn't work with
> aliases. This is something that will get fixed, but for the current release
> there isn't a workaround for this issue.
>
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>
>
> On Tue, May 19, 2020 at 8:25 AM Bjarke Buur Mortensen <
> morten...@eluence.com>
> wrote:
>
> > Hi list,
> >
> > I seem to be unable to get REINDEXCOLLECTION to work on a collection
> alias
> > (running Solr 8.2.0). The documentation seems to state that that should
> be
> > possible:
> >
> >
> https://lucene.apache.org/solr/guide/8_2/collection-management.html#reindexcollection
> > "name
> > Source collection name, may be an alias. This parameter is required."
> >
> > If I run on my alias (qa_supplier_products):
> > curl "
> >
> >
> http://localhost:8983/solr/admin/collections?action=REINDEXCOLLECTION=qa_supplier_products=1=start
> > I get an error:
> > "org.apache.solr.common.SolrException: Unable to copy documents from
> > qa_supplier_products to .rx_qa_supplier_products_6:
> > {\"result-set\":{\"docs\":[\n
> >  {\"DaemonOp\":\"Deamon:.rx_qa_supplier_products_6 started on
> > .rx_qa_supplier_products_0_shard1_replica_n1\"
> >
> > If I instead point to the underlying collection, everything works fine.
> Now
> > I have an alias pointing to an alias, which works, but ideally I would
> like
> > to just have my main alias point to the newly reindexed collection.
> >
> > Can anybody help me out here?
> >
> > Thanks,
> > /Bjarke
> >
>