Re: unique key across collections within datacenter

2020-05-13 Thread Bernd Fehling
Thanks Erick for your answer.

I was overcomplicating things and seeing problems which are not there.

I have your second scenario. The first huge collection still remains
and will grow further, while the second will start with the same schema but
content from a new source. Sure, I could also load the content
from the new source into the first huge collection, but I want to
keep source, loading, and maintenance handling separated.
Maybe I will also start the new collection on a new instance.

Regards
Bernd

Am 13.05.20 um 13:40 schrieb Erick Erickson:
> So a doc in your new collection is expected to supersede a doc
> with the same ID in the old one, right? 
> 
> What I’d do is delete the IDs from my old collection as they were added to
> the new one; there’s not much use in keeping both if you always want
> the new one.
> 
> Let’s assume you do this; the next issue is making sure all of your docs in
> the new collection are deleted from the old one, and your process will
> inevitably have a hiccough or two. You could periodically use streaming to
> produce a list of IDs common to both collections, and have a cleanup
> process you occasionally run to make up for any glitches in the normal
> delete-from-the-old-collection process; see:
> https://lucene.apache.org/solr/guide/6_6/stream-decorators.html#stream-decorators
> 
> If that’s not the case, then having the same id in the different collections
> doesn’t matter. Solr doesn’t use the ID for combining results, just for routing
> and then updating.
> 
> This is illustrated by the fact that, through user error, you can even get the
> same document repeated in a result set if it gets indexed to two different shards.
> 
> And if neither of those is on target, what do you think might go wrong about
> “handling” unique IDs across two collections?
> 
> Best,
> Erick
> 
>> On May 13, 2020, at 4:26 AM, Bernd Fehling  
>> wrote:
>>
>> Dear list,
>>
>> in my SolrCloud 6.6 I have a huge collection and now I will get
>> much more data from a different source to be indexed.
>> So I'm thinking about a new collection and combining both, the existing
>> one and the new one with an alias.
>>
>> But how to handle the unique key across collections within a datacenter?
>> Is it at all possible?
>>
>> I don't see any problems with add, update and delete of documents because
>> these operations are not using the alias.
>>
>> But searching across collections with the alias and then fetching documents
>> by id from the result may lead to results where the id is in both 
>> collections?
>>
>> I have no idea, but there are SolrClouds with a lot of collections out there.
>> How do they handle uniqueness across collections within a datacenter?
>>
>> Regards
>> Bernd
> 


Re: unique key across collections within datacenter

2020-05-13 Thread Erick Erickson
So a doc in your new collection is expected to supersede a doc
with the same ID in the old one, right? 

What I’d do is delete the IDs from my old collection as they were added to
the new one; there’s not much use in keeping both if you always want
the new one.

Let’s assume you do this; the next issue is making sure all of your docs in
the new collection are deleted from the old one, and your process will
inevitably have a hiccough or two. You could periodically use streaming to
produce a list of IDs common to both collections, and have a cleanup
process you occasionally run to make up for any glitches in the normal
delete-from-the-old-collection process; see:
https://lucene.apache.org/solr/guide/6_6/stream-decorators.html#stream-decorators
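
For what it's worth, a rough sketch of that idea using SolrJ and the intersect
stream decorator (collection names and the uniqueKey field "id" are my
assumptions; /export also assumes the id field has docValues):

import org.apache.solr.client.solrj.io.Tuple;
import org.apache.solr.client.solrj.io.stream.SolrStream;
import org.apache.solr.common.params.ModifiableSolrParams;

// Emit the ids that exist in BOTH collections, i.e. candidates for the cleanup delete.
String expr =
    "intersect("
  + "  search(old_collection, q=\"*:*\", fl=\"id\", sort=\"id asc\", qt=\"/export\"),"
  + "  search(new_collection, q=\"*:*\", fl=\"id\", sort=\"id asc\", qt=\"/export\"),"
  + "  on=\"id\")";
ModifiableSolrParams params = new ModifiableSolrParams();
params.set("expr", expr);
params.set("qt", "/stream");
SolrStream stream = new SolrStream("http://localhost:8983/solr/old_collection", params);
try {
  stream.open();
  for (Tuple t = stream.read(); !t.EOF; t = stream.read()) {
    String staleId = t.getString("id");   // an id to delete from the old collection
  }
} finally {
  stream.close();
}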

If that’s not the case, then having the same id in the different collections
doesn’t matter. Solr doesn’t use the ID for combining results, just for routing
and then updating.

This is illustrated by the fact that, through user error, you can even get the
same document repeated in a result set if it gets indexed to two different shards.

And if neither of those is on target, what do you think might go wrong about
“handling” unique IDs across two collections?

Best,
Erick

> On May 13, 2020, at 4:26 AM, Bernd Fehling  
> wrote:
> 
> Dear list,
> 
> in my SolrCloud 6.6 I have a huge collection and now I will get
> much more data from a different source to be indexed.
> So I'm thinking about a new collection and combining both, the existing
> one and the new one with an alias.
> 
> But how to handle the unique key across collections within a datacenter?
> Is it at all possible?
> 
> I don't see any problems with add, update and delete of documents because
> these operations are not using the alias.
> 
> But searching across collections with the alias and then fetching documents
> by id from the result may lead to results where the id is in both collections?
> 
> I have no idea, but there are SolrClouds with a lot of collections out there.
> How do they handle uniqueness across collections within a datacenter?
> 
> Regards
> Bernd



unique key across collections within datacenter

2020-05-13 Thread Bernd Fehling
Dear list,

in my SolrCloud 6.6 I have a huge collection and now I will get
much more data from a different source to be indexed.
So I'm thinking about a new collection and combining both, the existing
one and the new one with an alias.
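
For what it's worth, a combined alias over both collections can be created with a
couple of lines of SolrJ; a rough sketch (alias, collection names, and the
ZooKeeper address are made up):

import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.client.solrj.request.CollectionAdminRequest;

CloudSolrClient client = new CloudSolrClient.Builder()
    .withZkHost("localhost:9983")
    .build();
// Searches against "both" fan out over the existing and the new collection.
CollectionAdminRequest.createAlias("both", "existing_collection,new_collection").process(client);
client.close();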

But how to handle the unique key across collections within a datacenter?
Is it at all possible?

I don't see any problems with add, update and delete of documents because
these operations are not using the alias.

But searching across collections with the alias and then fetching documents
by id from the result may lead to results where the id is in both collections?

I have no idea, but there are SolrClouds with a lot of collections out there.
How do they handle uniqueness across collections within a datacenter?

Regards
Bernd


Long value as unique key in long term

2018-11-22 Thread Jaroslaw Rozanski

Hi all,

This is interesting (to me):

1. The "TrieLongField" is deprecated (still there in 7.5, but marked
   deprecated)
2. The "LongPointField" is not allowed as uniqueKey

According to Solr Ref Guide 7.5 
(http://lucene.apache.org/solr/guide/7_5/field-types-included-with-solr.html) 
these are the only options for long types.


Short of mapping it as a "string", is there a long-term solution for those
who want/need to store the unique ID as a long?



For reference: https://issues.apache.org/jira/browse/SOLR-10829


Thanks,
Jarek

--
Jaroslaw Rozanski | e: m...@jarekrozanski.eu



Re: solr cloud unique key query request is sent to all shards!

2018-02-19 Thread Ganesh Sethuraman
This works! Both the V1 and V2 versions of the real-time get work fine.
Just an added note: the performance (response time) also improved.

Thanks, Tomas

On Mon, Feb 19, 2018 at 1:17 AM, Tomas Fernandez Lobbe 
wrote:

> In real-time get, the parameter name is “id”, regardless of the name of
> the unique key.
>
> The request in your case should be:
> http://:8080/api/collections/col1/get?id=69749398
>
> See: https://lucene.apache.org/solr/guide/7_2/realtime-get.html
>
> Sent from my iPhone
>
> > On Feb 18, 2018, at 9:28 PM, Ganesh Sethuraman 
> wrote:
> >
> > I tried this real time get on my collection using the both V1 and V2 URL
> > for real time get, but did not work!!!
> >
> > http://:8080/api/collections/col1/get?myid:69749398
> >
> > it returned...
> >
> > {
> >  "doc":null}
> >
> > same issue with V1 URL as well, http://
> > :8080/solr/col1/get?myid:69749398
> >
> > however if i do q=myid:69749398 with "select" request handler seems to
> > fine. I checked my schema again and it is configured correctly.  Like
> below:
> >
> > myid
> >
> > Also i see that this implicit request handler is configured correctly Any
> > thoughts, what I might be missing?
> >
> >
> >
> > On Sun, Feb 18, 2018 at 11:18 PM, Tomas Fernandez Lobbe <
> tflo...@apple.com>
> > wrote:
> >
> >> I think real-time get should be directed to the correct shard. Try:
> >> [COLLECTION]/get?id=[YOUR_ID]
> >>
> >> Sent from my iPhone
> >>
> >>> On Feb 18, 2018, at 3:17 PM, Ganesh Sethuraman <
> ganeshmail...@gmail.com>
> >> wrote:
> >>>
> >>> Hi
> >>>
> >>> I am using Solr 7.2.1. I have 8 shards in two nodes (two different m/c)
> >>> using Solr Cloud. The data was indexed with a unique key (default
> >> composite
> >>> id) using the CSV update handler (batch indexing). Note that I do NOT
> >> have
> >>>  while indexing.   Then when I try to  query the
> >>> collection col1 based on my primary key (as below), I see that in the
> >>> 'debug' response that the query was sent to all the shards and when it
> >>> finds the document in one the shards it sends a GET FIELD to that shard
> >> to
> >>> get the data.  The problem is potentially high response time, and more
> >>> importantly scalability issue as unnecessarily all shards are being
> >> queried
> >>> to get one document (by unique key).
> >>>
> >>> http://:8080/solr/col1/select?debug=true&q=id:69749278
> >>>
> >>> Is there a way to query to reach the right shard based on the hash of
> >>> the unique key?
> >>>
> >>> Regards
> >>> Ganesh
> >>
>


Re: solr cloud unique key query request is sent to all shards!

2018-02-18 Thread Tomas Fernandez Lobbe
In real-time get, the parameter name is “id”, regardless of the name of the 
unique key. 

The request in your case should be:
http://:8080/api/collections/col1/get?id=69749398

See: https://lucene.apache.org/solr/guide/7_2/realtime-get.html
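
In SolrJ the same real-time get is available as getById, which likewise keys on
the "id" parameter; a minimal sketch (the host in the base URL is assumed):

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrDocument;

SolrClient client = new HttpSolrClient.Builder("http://localhost:8080/solr").build();
// Routed to the shard that owns the id, so only one shard is queried.
SolrDocument doc = client.getById("col1", "69749398");
client.close();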

Sent from my iPhone

> On Feb 18, 2018, at 9:28 PM, Ganesh Sethuraman  
> wrote:
> 
> I tried this real-time get on my collection using both the V1 and V2 URLs
> for real-time get, but it did not work!!!
> 
> http://:8080/api/collections/col1/get?myid:69749398
> 
> it returned...
> 
> {
>  "doc":null}
> 
> same issue with V1 URL as well, http://
> :8080/solr/col1/get?myid:69749398
> 
> however, if I do q=myid:69749398 with the "select" request handler, it seems to
> work fine. I checked my schema again and it is configured correctly. Like below:
> 
> <uniqueKey>myid</uniqueKey>
> 
> Also, I see that the implicit request handler is configured correctly. Any
> thoughts on what I might be missing?
> 
> 
> 
> On Sun, Feb 18, 2018 at 11:18 PM, Tomas Fernandez Lobbe 
> wrote:
> 
>> I think real-time get should be directed to the correct shard. Try:
>> [COLLECTION]/get?id=[YOUR_ID]
>> 
>> Sent from my iPhone
>> 
>>> On Feb 18, 2018, at 3:17 PM, Ganesh Sethuraman 
>> wrote:
>>> 
>>> Hi
>>> 
>>> I am using Solr 7.2.1. I have 8 shards in two nodes (two different m/c)
>>> using Solr Cloud. The data was indexed with a unique key (default
>> composite
>>> id) using the CSV update handler (batch indexing). Note that I do NOT
>> have
>>>  while indexing.   Then when I try to  query the
>>> collection col1 based on my primary key (as below), I see that in the
>>> 'debug' response that the query was sent to all the shards and when it
>>> finds the document in one the shards it sends a GET FIELD to that shard
>> to
>>> get the data.  The problem is potentially high response time, and more
>>> importantly scalability issue as unnecessarily all shards are being
>> queried
>>> to get one document (by unique key).
>>> 
>>> http://:8080/solr/col1/select?debug=true&q=id:69749278
>>> 
>>> Is there a way to query to reach the right shard based on the hash of the
>>> unique key?
>>> 
>>> Regards
>>> Ganesh
>> 


Re: solr cloud unique key query request is sent to all shards!

2018-02-18 Thread Ganesh Sethuraman
I tried this real-time get on my collection using both the V1 and V2 URLs
for real-time get, but it did not work!!!

http://:8080/api/collections/col1/get?myid:69749398

it returned...

{
  "doc":null}

same issue with the V1 URL as well:
http://:8080/solr/col1/get?myid:69749398

however, if I do q=myid:69749398 with the "select" request handler, it seems to
work fine. I checked my schema again and it is configured correctly. Like below:

<uniqueKey>myid</uniqueKey>

Also, I see that the implicit request handler is configured correctly. Any
thoughts on what I might be missing?



On Sun, Feb 18, 2018 at 11:18 PM, Tomas Fernandez Lobbe 
wrote:

> I think real-time get should be directed to the correct shard. Try:
> [COLLECTION]/get?id=[YOUR_ID]
>
> Sent from my iPhone
>
> > On Feb 18, 2018, at 3:17 PM, Ganesh Sethuraman 
> wrote:
> >
> > Hi
> >
> > I am using Solr 7.2.1. I have 8 shards in two nodes (two different m/c)
> > using Solr Cloud. The data was indexed with a unique key (default
> composite
> > id) using the CSV update handler (batch indexing). Note that I do NOT
> have
> >  while indexing.   Then when I try to  query the
> > collection col1 based on my primary key (as below), I see that in the
> > 'debug' response that the query was sent to all the shards and when it
> > finds the document in one the shards it sends a GET FIELD to that shard
> to
> > get the data.  The problem is potentially high response time, and more
> > importantly scalability issue as unnecessarily all shards are being
> queried
> > to get one document (by unique key).
> >
> > http://:8080/solr/col1/select?debug=true&q=id:69749278
> >
> > Is there a way to query to reach the right shard based on the hash of the
> > unique key?
> >
> > Regards
> > Ganesh
>


Re: solr cloud unique key query request is sent to all shards!

2018-02-18 Thread Tomas Fernandez Lobbe
I think real-time get should be directed to the correct shard. Try:  
[COLLECTION]/get?id=[YOUR_ID]

Sent from my iPhone

> On Feb 18, 2018, at 3:17 PM, Ganesh Sethuraman  
> wrote:
> 
> Hi
> 
> I am using Solr 7.2.1. I have 8 shards in two nodes (two different m/c)
> using Solr Cloud. The data was indexed with a unique key (default composite
> id) using the CSV update handler (batch indexing). Note that I do NOT have
>  while indexing.   Then when I try to  query the
> collection col1 based on my primary key (as below), I see that in the
> 'debug' response that the query was sent to all the shards and when it
> finds the document in one the shards it sends a GET FIELD to that shard to
> get the data.  The problem is potentially high response time, and more
> importantly scalability issue as unnecessarily all shards are being queried
> to get one document (by unique key).
> 
> http://:8080/solr/col1/select?debug=true&q=id:69749278
> 
> Is there a way to query to reach the right shard based on the hash of the
> unique key?
> 
> Regards
> Ganesh


solr cloud unique key query request is sent to all shards!

2018-02-18 Thread Ganesh Sethuraman
Hi

I am using Solr 7.2.1. I have 8 shards on two nodes (two different machines)
using SolrCloud. The data was indexed with a unique key (default composite
id) using the CSV update handler (batch indexing). Note that I do NOT have
 while indexing. Then when I try to query the
collection col1 based on my primary key (as below), I see in the
'debug' response that the query was sent to all the shards, and when it
finds the document in one of the shards it sends a GET FIELD to that shard to
get the data. The problem is potentially high response time and, more
importantly, a scalability issue, as all shards are unnecessarily queried
to get one document (by unique key).

http://:8080/solr/col1/select?debug=true&q=id:69749278

Is there a way to query to reach the right shard based on the hash of the
unique key?

Regards
Ganesh


Re: cursor with sort value along with unique key

2017-06-15 Thread Mikhail Khludnev
Hello,
http://lucene.472066.n3.nabble.com/Pagination-bug-when-sorting-by-a-field-not-unique-field-tp4327408p4327524.html
might be relevant.

On Thu, Jun 15, 2017 at 12:40 PM, Preeti Chhabra <
preeti.chha...@karexpert.com> wrote:

> Hello,
>
> With respect to cursors, using a computed sort value (like score)
> in combination with the unique field sort (score desc, id asc) seems to
> cause some wildly inconsistent and incomplete results.  Can I get any help
> with this?
>
>
> Thanks & Regards
>
> Preeti chhabra
>
>


-- 
Sincerely yours
Mikhail Khludnev


cursor with sort value along with unique key

2017-06-15 Thread Preeti Chhabra

Hello,

With respect to cursors, using a computed sort value (like score)
in combination with the unique field sort (score desc, id asc) seems to
cause some wildly inconsistent and incomplete results.  Can I get any
help with this?



Thanks & Regards

Preeti chhabra



Re: Schema API: Modify Unique Key

2017-03-28 Thread nabil Kouici
Thank you for your reply.
We would like to have this functionality in order to change the unique key so we can
do partial updates. A partial update cannot work without a unique key, and our need is
to do something like in SQL (update documents set documents.field1 = wyz where
documents.field2 = xxx). So we would set field2 as the unique key (uniqueness is OK)
and do that kind of update.
Regards, Nabil.
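
For reference, this is roughly what a partial (atomic) update looks like from
SolrJ - it is always keyed on the uniqueKey value, which is why an update-by-query
on field2 is not possible out of the box (collection, field, and id values below
are made up):

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrInputDocument;
import java.util.Collections;

SolrClient client = new HttpSolrClient.Builder("http://localhost:8983/solr").build();
SolrInputDocument doc = new SolrInputDocument();
doc.addField("id", "doc-123");                                    // uniqueKey of the document to patch
doc.addField("field1", Collections.singletonMap("set", "wyz"));   // atomic "set" of field1
client.add("documents", doc);
client.commit("documents");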

From: Shawn Heisey
To: solr-user@lucene.apache.org
Sent: Monday, March 27, 2017, 17:00
Subject: Re: Schema API: Modify Unique Key
On 3/27/2017 7:05 AM, nabil Kouici wrote:
> We're going to use Solr in our organization (under test) and we want
> to set the primary key through schema API, which is not allowed today.
> Is this function planned to be implemented in Solr? If yes, do you
> have any idea in which version? 

Steve Rowe has been working on it, as he mentioned.  I have asked him a
question via the SOLR-7242 issue.

I can think of two reasons that this functionality has NOT been written yet:

1) In Cloud mode on a distributed index, it is unlikely that the
existing collection will have the documents in the correct shards. 
Entirely reindexing is strongly recommended in these situations.

2) Before changing the uniqueKey, you must be absolutely certain that
the field is the appropriate type and that the field does not contain
the same value more than once.  If this is not the case, Solr will not
behave correctly.

Thanks,
Shawn



   

Re: Schema API: Modify Unique Key

2017-03-27 Thread Shawn Heisey
On 3/27/2017 7:05 AM, nabil Kouici wrote:
> We're going to use Solr in our organization (under test) and we want
> to set the primary key through schema API, which is not allowed today.
> Is this function planned to be implemented in Solr? If yes, do you
> have any idea in which version? 

Steve Rowe has been working on it, as he mentioned.  I have asked him a
question via the SOLR-7242 issue.

I can think of two reasons that this functionality has NOT been written yet:

1) In Cloud mode on a distributed index, it is unlikely that the
existing collection will have the documents in the correct shards. 
Entirely reindexing is strongly recommended in these situations.

2) Before changing the uniqueKey, you must be absolutely certain that
the field is the appropriate type and that the field does not contain
the same value more than once.  If this is not the case, Solr will not
behave correctly.

Thanks,
Shawn



Re: Schema API: Modify Unique Key

2017-03-27 Thread Steve Rowe
Hi Nabil,

There is an open JIRA issue to implement this functionality, but I haven’t had
a chance to work on it recently: https://issues.apache.org/jira/browse/SOLR-7242.
Consequently, I’m not sure which release will have it.
which release will have it.

Patches welcome!

--
Steve
www.lucidworks.com

> On Mar 27, 2017, at 9:05 AM, nabil Kouici  wrote:
> 
> Hi All,
> 
> 
> 
> We're going to use Solr in our organization (under test) and we want to set
> the primary key through the schema API, which is not allowed today. Is this
> functionality planned to be implemented in Solr? If yes, do you have any idea in
> which version?
> Regards, Nabil.
> 



Schema API: Modify Unique Key

2017-03-27 Thread nabil Kouici
Hi All,



We're going to use Solr in our organization (under test) and we want to set the
primary key through the schema API, which is not allowed today. Is this
functionality planned to be implemented in Solr? If yes, do you have any idea in
which version?
Regards, Nabil.

Re: Auto-generate unique key when adding documents from SolrJ

2017-02-26 Thread OTH
Thanks, great, it's working now!
Omer

On Sun, Feb 26, 2017 at 8:24 PM, Alexandre Rafalovitch 
wrote:

> It is not enough to declare the URP chain; you have to invoke it.
>
> Either by marking it default or by adding the update.chain parameter
> to the request handler (or in initParams) you use to update the
> documents (usually /update). See, for example:
> https://github.com/apache/lucene-solr/blob/master/solr/
> server/solr/configsets/data_driven_schema_configs/conf/solrconfig.xml#L837
>
> Regards,
>Alex.
> 
> http://www.solr-start.com/ - Resources for Solr users, new and experienced
>
>
> On 26 February 2017 at 10:11, OTH  wrote:
> > Hello all,
> >
> > First of all, I am very new to Solr.
> >
> > I am using Solr version 6.4.1.  I have a Solr core (non-cloud), where
> there
> > is a mandatory unique key field called "id".
> >
> > I am trying to add documents to the core from Java, without having to
> > specify the "id" field explicitly; i.e. to have it auto-generated.
> >
> > I learned that this is possible by including the following information in
> > the conf/solrconfig.xml file:
> >
> >> 
> >> 
> >> 
> >> id
> >>   
> >> ...
> >> 
> >> 
> >> 
> >>   
> >
> >
> > (I did restart the server after adding the above text to the xml file.)
> >
> > However, when I try to add documents from Java using SolrJ (without
> > specifying the "id" field), I get the following exception:
> >
> >> Exception in thread "main"
> >> org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:
> Error
> >> from server at http://localhost:8983/solr/sales_history: Document is
> >> missing mandatory uniqueKey field: id
> >
> >
> > My Java code is like this:
> >
> >> SolrClient solr = new HttpSolrClient.Builder(SOLR_URL).build();
> >> SolrInputDocument document = new SolrInputDocument();
> >> document.addField(..., ...);
> >> document.addField(..., ...);
> >> UpdateResponse updateResponse = solr.add(document);
> >
> >
> > The exception is thrown from the last line above.
> >
> > Is there any way to add documents from Java and have the uniqueKey field
> be
> > auto-generated?
> >
> >
> > Thank you
>


Re: Auto-generate unique key when adding documents from SolrJ

2017-02-26 Thread Alexandre Rafalovitch
It is not enough to declare the URP chain; you have to invoke it.

Either mark it as the default or add the update.chain parameter
to the request handler (or in initParams) you use to update the
documents (usually /update). See, for example:
https://github.com/apache/lucene-solr/blob/master/solr/server/solr/configsets/data_driven_schema_configs/conf/solrconfig.xml#L837
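
A rough SolrJ sketch of passing the parameter per request, assuming the chain in
solrconfig.xml is named "uuid" (adjust the name to whatever your chain is called):

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.request.UpdateRequest;
import org.apache.solr.common.SolrInputDocument;

SolrClient solr = new HttpSolrClient.Builder("http://localhost:8983/solr/sales_history").build();
UpdateRequest req = new UpdateRequest();
req.setParam("update.chain", "uuid");       // invoke the URP chain for this request only
SolrInputDocument document = new SolrInputDocument();
document.addField("name", "some value");    // no "id" field; the chain generates it
req.add(document);
req.process(solr);
solr.commit();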

Regards,
   Alex.

http://www.solr-start.com/ - Resources for Solr users, new and experienced


On 26 February 2017 at 10:11, OTH  wrote:
> Hello all,
>
> First of all, I am very new to Solr.
>
> I am using Solr version 6.4.1.  I have a Solr core (non-cloud), where there
> is a mandatory unique key field called "id".
>
> I am trying to add documents to the core from Java, without having to
> specify the "id" field explicitly; i.e. to have it auto-generated.
>
> I learned that this is possible by including the following information in
> the conf/solrconfig.xml file:
>
>> 
>> 
>> 
>> id
>>   
>> ...
>> 
>> 
>> 
>>   
>
>
> (I did restart the server after adding the above text to the xml file.)
>
> However, when I try to add documents from Java using SolrJ (without
> specifying the "id" field), I get the following exception:
>
>> Exception in thread "main"
>> org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error
>> from server at http://localhost:8983/solr/sales_history: Document is
>> missing mandatory uniqueKey field: id
>
>
> My Java code is like this:
>
>> SolrClient solr = new HttpSolrClient.Builder(SOLR_URL).build();
>> SolrInputDocument document = new SolrInputDocument();
>> document.addField(..., ...);
>> document.addField(..., ...);
>> UpdateResponse updateResponse = solr.add(document);
>
>
> The exception is thrown from the last line above.
>
> Is there any way to add documents from Java and have the uniqueKey field be
> auto-generated?
>
>
> Thank you


Auto-generate unique key when adding documents from SolrJ

2017-02-26 Thread OTH
Hello all,

First of all, I am very new to Solr.

I am using Solr version 6.4.1.  I have a Solr core (non-cloud), where there
is a mandatory unique key field called "id".

I am trying to add documents to the core from Java, without having to
specify the "id" field explicitly; i.e. to have it auto-generated.

I learned that this is possible by including the following information in
the conf/solrconfig.xml file:

> <updateRequestProcessorChain name="uuid">
>   <processor class="solr.UUIDUpdateProcessorFactory">
>     <str name="fieldName">id</str>
>   </processor>
>   ...
>   <processor class="solr.RunUpdateProcessorFactory" />
> </updateRequestProcessorChain>


(I did restart the server after adding the above text to the xml file.)

However, when I try to add documents from Java using SolrJ (without
specifying the "id" field), I get the following exception:

> Exception in thread "main"
> org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error
> from server at http://localhost:8983/solr/sales_history: Document is
> missing mandatory uniqueKey field: id


My Java code is like this:

> SolrClient solr = new HttpSolrClient.Builder(SOLR_URL).build();
> SolrInputDocument document = new SolrInputDocument();
> document.addField(..., ...);
> document.addField(..., ...);
> UpdateResponse updateResponse = solr.add(document);


The exception is thrown from the last line above.

Is there any way to add documents from Java and have the uniqueKey field be
auto-generated?


Thank you


Re: Unique key field type in solr 6.1 schema

2016-08-09 Thread Bharath Kumar
> > I have an issue with cross data center replication, when we delete the
> > document by id from the main site. The target site document is not
> deleted.
> > I have the id field which is a unique field for my schema which is
> > configured as "long".
> >
> > If i change the type to "string" it works fine. Is there any issue using
> > long. Because we migrated from 4.4 to 6.1, and we had the id field as
> long.
> > Can you please help me with this. Really appreciate your help.
> >
> > I see the below error on the target site:-
> >
> >  o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException:
> Invalid
> > Number:
> >   at org.apache.solr.schema.TrieField.readableToIndexed(
> > TrieField.java:537)
> > at
> > org.apache.solr.update.DeleteUpdateCommand.getIndexedId(
> > DeleteUpdateCommand.java:65)
> > at
> > org.apache.solr.update.processor.DistributedUpdateProcessor.
> versionDelete(
> > DistributedUpdateProcessor.java:1495)
> > at
> > org.apache.solr.update.processor.CdcrUpdateProcessor.versionDelete(
> > CdcrUpdateProcessor.java:85)
> >
> > Thanks,
> > Bharath Kumar
> >
> >
> >
> > --
> > View this message in context: http://lucene.472066.n3.
> > nabble.com/Unique-key-field-type-in-solr-6-1-schema-tp4290895.html
> > Sent from the Solr - User mailing list archive at Nabble.com.
> >
>



-- 
Thanks & Regards,
Bharath MV Kumar

"Life is short, enjoy every moment of it"


Re: Unique key field type in solr 6.1 schema

2016-08-09 Thread Daniel Collins
This vaguely rings a bell, though from a long time ago.  We had our id
field using the "lowercase" type in Solr, and that broke/changed somewhere
in the 4.x series (we are on 4.8.1 now and it doesn't work there), so we
had to revert to a simple "string" type instead.  I know you have a very
different use case, but I don't think it's anything to do with CDCR or 6.x;
I think it's a "problem" in the 4.x series. You might want to check the 4.x
release notes, and/or try upgrading to 4.10.4 (the latest in the 4.x
series) just to see what the behavior is there; I think it changed
somewhere around 4.4 or 4.6...

But I'm talking probably 2-3 years ago, so my memory is hazy on this.

On 9 August 2016 at 08:51, bharath.mvkumar 
wrote:

> Hi All,
>
> I have an issue with cross data center replication, when we delete the
> document by id from the main site. The target site document is not deleted.
> I have the id field which is a unique field for my schema which is
> configured as "long".
>
> If i change the type to "string" it works fine. Is there any issue using
> long. Because we migrated from 4.4 to 6.1, and we had the id field as long.
> Can you please help me with this. Really appreciate your help.
>
> I see the below error on the target site:-
>
>  o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException: Invalid
> Number:
>   at org.apache.solr.schema.TrieField.readableToIndexed(
> TrieField.java:537)
> at
> org.apache.solr.update.DeleteUpdateCommand.getIndexedId(
> DeleteUpdateCommand.java:65)
> at
> org.apache.solr.update.processor.DistributedUpdateProcessor.versionDelete(
> DistributedUpdateProcessor.java:1495)
> at
> org.apache.solr.update.processor.CdcrUpdateProcessor.versionDelete(
> CdcrUpdateProcessor.java:85)
>
> Thanks,
> Bharath Kumar
>
>
>
> --
> View this message in context: http://lucene.472066.n3.
> nabble.com/Unique-key-field-type-in-solr-6-1-schema-tp4290895.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Unique key field type in solr 6.1 schema

2016-08-09 Thread bharath.mvkumar
Hi All,

I have an issue with cross data center replication: when we delete the
document by id from the main site, the target site document is not deleted.
I have the id field, which is the unique field for my schema, and it is
configured as "long".

If I change the type to "string" it works fine. Is there any issue using
long? We migrated from 4.4 to 6.1, and we had the id field as long.
Can you please help me with this? I really appreciate your help.

I see the below error on the target site:-

 o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException: Invalid
Number:
  at org.apache.solr.schema.TrieField.readableToIndexed(TrieField.java:537)
at
org.apache.solr.update.DeleteUpdateCommand.getIndexedId(DeleteUpdateCommand.java:65)
at
org.apache.solr.update.processor.DistributedUpdateProcessor.versionDelete(DistributedUpdateProcessor.java:1495)
at
org.apache.solr.update.processor.CdcrUpdateProcessor.versionDelete(CdcrUpdateProcessor.java:85)

Thanks,
Bharath Kumar



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Unique-key-field-type-in-solr-6-1-schema-tp4290895.html
Sent from the Solr - User mailing list archive at Nabble.com.


Unique key field type in solr 6.1 schema

2016-08-07 Thread Bharath Kumar
Hi All,

I have an issue with cross data center replication: when we delete the
document by id from the main site, the target site document is not deleted.
I have the id field, which is the unique field for my schema, and it is
configured as "long".

If I change the type to "string" it works fine. Is there any issue using
long? We migrated from 4.4 to 6.1, and we had the id field as long.
Can you please help me with this? I really appreciate your help.

*I see the below error on the target site:-*

 o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException: Invalid
Number:
  at org.apache.solr.schema.TrieField.readableToIndexed(TrieField.java:537)
at org.apache.solr.update.DeleteUpdateCommand.getIndexedId(
DeleteUpdateCommand.java:65)
at org.apache.solr.update.processor.DistributedUpdateProcessor.
versionDelete(DistributedUpdateProcessor.java:1495)
at org.apache.solr.update.processor.CdcrUpdateProcessor.
versionDelete(CdcrUpdateProcessor.java:85)

-- 
Thanks & Regards,
Bharath MV Kumar

"Life is short, enjoy every moment of it"


Re: Multiple unique key in Schema

2015-11-17 Thread Erik Hatcher
Fair point indeed.  Depends on how your update process works though.  One can 
do the trick of assigning batch numbers to an indexing run and deleting 
documents that aren’t from that reindexing run for example, so it’s not 
necessary to overwrite documents to “replace” them per se.

Erik


> On Nov 17, 2015, at 9:01 AM, Mugeesh Husain  wrote:
> 
>>> Or perhaps use the UUID auto id feature. 
> if I use UUID, then how can I update a particular document? I think using
> this, there will not be any document identity.
> 
> 
> 
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Multiple-unique-key-in-Schema-tp4240550p4240557.html
> Sent from the Solr - User mailing list archive at Nabble.com.



Re: Multiple unique key in Schema

2015-11-17 Thread Mugeesh Husain
>>Or perhaps use the UUID auto id feature. 
If I use UUID, then how can I update a particular document? I think using
this, there will not be any document identity.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Multiple-unique-key-in-Schema-tp4240550p4240563.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Multiple unique key in Schema

2015-11-17 Thread Mugeesh Husain
>>Or perhaps use the UUID auto id feature. 
If I use UUID, then how can I update a particular document? I think using
this, there will not be any document identity.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Multiple-unique-key-in-Schema-tp4240550p4240557.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Multiple unique key in Schema

2015-11-17 Thread Erik Hatcher
Make each document have a composite unique key: user-1, user-2, review-1,
etc.

Easier said than done if you're just posting the CSV directly to Solr, but an
update script could help.

Or perhaps use the UUID auto id feature. 

  Erik

> On Nov 17, 2015, at 08:14, Mugeesh Husain  wrote:
> 
> Hi!
> 
> I have a 3 csv table,
> 1.) Restaurant
> 2.)User
> 3.)Review
> 
> every csv have a unique key, then how i can configure multiple unique key in
> solr
> 
> 
> 
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Multiple-unique-key-in-Schema-tp4240550.html
> Sent from the Solr - User mailing list archive at Nabble.com.


Re: Multiple unique key in Schema

2015-11-17 Thread Alexandre Rafalovitch
When you index into Solr, you are overlapping the definitions into one
schema. Therefore, you will need a unified uniqueKey.

There is a couple of approaches:
1) Maybe you don't actually store the data as three types of entities.
Think about what you will want to find and structure the data to
match. Doing JOINS in Solr is a bad idea, even if sometimes possible
2) Make a compositeKey as the unique key by adding a type prefix to your key
ids when exporting from SQL (select concat('r', id), ...) - see the sketch below
3) Make a compositeKey as unique key by using UpdateRequestProcessors
to manipulate the value of the uniqueKey field. You'd need three
different update chains to apply different prefixes, but you can pass
the chain name as a request parameter. You can find the full list of
the URPs at: http://www.solr-start.com/info/update-request-processors/
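
A minimal SolrJ sketch of the compositeKey idea from option 2, done on the client
side while indexing (field names, prefixes, and the URL are made up):

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrInputDocument;

SolrClient client = new HttpSolrClient.Builder("http://localhost:8983/solr/mycollection").build();
String reviewId = "42";                              // the primary key from the Review CSV
SolrInputDocument doc = new SolrInputDocument();
doc.addField("id", "review-" + reviewId);            // composite uniqueKey: entity prefix + original key
doc.addField("type", "review");                      // keep the entity type around for filtering
client.add(doc);
client.commit();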

Regards,
Alex.

Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter:
http://www.solr-start.com/


On 17 November 2015 at 08:14, Mugeesh Husain  wrote:
> Hi!
>
> I have a 3 csv table,
> 1.) Restaurant
> 2.)User
> 3.)Review
>
> every csv have a unique key, then how i can configure multiple unique key in
> solr
>
>
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Multiple-unique-key-in-Schema-tp4240550.html
> Sent from the Solr - User mailing list archive at Nabble.com.


Multiple unique key in Schema

2015-11-17 Thread Mugeesh Husain
Hi!

I have 3 CSV tables:
1.) Restaurant
2.) User
3.) Review

Every CSV has a unique key, so how can I configure multiple unique keys in
Solr?



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Multiple-unique-key-in-Schema-tp4240550.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Help on Out of memory when using Cursor with sort on Unique Key

2015-09-09 Thread Naresh Yadav
DocValues with reindexing does not seem a viable option for me as of
now. Regarding the second question on Xmx4G: I tried
various options (Xmx8G, Xmx10G, Xmx12G); none worked except Xmx14G, which
does not seem practical for production with 16 GB RAM.

While searching I came across:
https://issues.apache.org/jira/browse/SOLR-6121 &
https://issues.apache.org/jira/browse/SOLR-6277
I think I am also in the same boat, where for cursor use I have to specify the
unique key field in the sort, and that leads to heap space errors.

Please suggest some workaround for now, as SOLR-6121 and SOLR-6277 are still in
discussion.


On Tue, Sep 8, 2015 at 7:42 PM, Raja Pothuganti <
rpothuga...@competitrack.com> wrote:

> Hi Naresh
>
> 1) For 'sort by' fields, have you considered using DocValue=true for in
> schema definition.
> If you  are changing schema definition, you would need redo full reindex
> after backing up & deleting current index from dataDir.
> Also note that, adding docValue=true would increase size of index.
>
> 2)>Each node memory parameter : -Xms2g, -Xmx4g
> What is the basis choosing above memory sizes? Have you observed through
> jconsole or visual vm?
>
> Raja
> On 9/8/15, 8:57 AM, "Naresh Yadav"  wrote:
>
> >Cluster details :
> >
> >Solr Version  : solr-4.10.4
> >No of nodes : 2 each 16 GB RAM
> >Node of shards : 2
> >Replication : 1
> >Each node memory parameter : -Xms2g, -Xmx4g
> >
> >Collection details :
> >
> >No of docs in my collection : 12.31 million
> >Indexed field per document : 2
> >Unique key field : tids
> >Stored filed per document : varies 30- 40
> >Total index size node1+node2 = 13gb+13gb=26gb
> >
> >Query throwing Heap Space : /select?q=*:*&sort=tids+desc&rows=100&fl=tids
> >
> >Query working* : */select?q=*:*&rows=100&fl=tids
> >
> >I am using sort on unique key field tids for Cursor based pagination of
> >100
> >size.
> >
> >Already tried :
> >
> >I also tried tweaking Xmx but problem not solved..
> >I also tried q with criteria of indexed filed with only 4200 hits that
> >also
> >not working
> >when sort parameter included.
> >
> >Please help me here as i am clueless why OOM error in getting 100
> >documents.
> >
> >Thanks
> >Naresh
>
>


Re: Help on Out of memory when using Cursor with sort on Unique Key

2015-09-08 Thread Raja Pothuganti
Hi Naresh

1) For 'sort by' fields, have you considered using docValues=true in the
schema definition?
If you are changing the schema definition, you would need to redo a full reindex
after backing up & deleting the current index from dataDir.
Also note that adding docValues=true would increase the size of the index.

2) >Each node memory parameter : -Xms2g, -Xmx4g
What is the basis for choosing the above memory sizes? Have you observed them
through JConsole or VisualVM?

Raja
On 9/8/15, 8:57 AM, "Naresh Yadav"  wrote:

>Cluster details :
>
>Solr Version  : solr-4.10.4
>No of nodes : 2 each 16 GB RAM
>Node of shards : 2
>Replication : 1
>Each node memory parameter : -Xms2g, -Xmx4g
>
>Collection details :
>
>No of docs in my collection : 12.31 million
>Indexed field per document : 2
>Unique key field : tids
>Stored filed per document : varies 30- 40
>Total index size node1+node2 = 13gb+13gb=26gb
>
>Query throwing Heap Space : /select?q=*:*&sort=tids+desc&rows=100&fl=tids
>
>Query working* : */select?q=*:*&rows=100&fl=tids
>
>I am using sort on unique key field tids for Cursor based pagination of
>100
>size.
>
>Already tried :
>
>I also tried tweaking Xmx but problem not solved..
>I also tried q with criteria of indexed filed with only 4200 hits that
>also
>not working
>when sort parameter included.
>
>Please help me here as i am clueless why OOM error in getting 100
>documents.
>
>Thanks
>Naresh



Help on Out of memory when using Cursor with sort on Unique Key

2015-09-08 Thread Naresh Yadav
Cluster details :

Solr Version  : solr-4.10.4
No of nodes : 2 each 16 GB RAM
No of shards : 2
Replication : 1
Each node memory parameter : -Xms2g, -Xmx4g

Collection details :

No of docs in my collection : 12.31 million
Indexed fields per document : 2
Unique key field : tids
Stored fields per document : varies 30-40
Total index size node1+node2 = 13gb+13gb=26gb

Query throwing Heap Space : /select?q=*:*&sort=tids+desc&rows=100&fl=tids

Query working : /select?q=*:*&rows=100&fl=tids

I am using a sort on the unique key field tids for cursor-based pagination with a
page size of 100.
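
For context, the cursor loop looks roughly like this in SolrJ (sketched with a
newer SolrJ client API than the 4.10.4 used here; the URL is made up). The sort
on tids is what forces the whole field to be loaded for sorting, which is where
the heap goes:

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.params.CursorMarkParams;

SolrClient client = new HttpSolrClient.Builder("http://localhost:8983/solr/mycollection").build();
SolrQuery q = new SolrQuery("*:*");
q.setRows(100);
q.setFields("tids");
q.setSort(SolrQuery.SortClause.desc("tids"));        // cursorMark requires the uniqueKey in the sort
String cursorMark = CursorMarkParams.CURSOR_MARK_START;
boolean done = false;
while (!done) {
  q.set(CursorMarkParams.CURSOR_MARK_PARAM, cursorMark);
  QueryResponse rsp = client.query(q);
  // ... process rsp.getResults() ...
  String next = rsp.getNextCursorMark();
  done = cursorMark.equals(next);                     // unchanged mark means everything has been read
  cursorMark = next;
}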

Already tried :

I also tried tweaking Xmx, but the problem was not solved.
I also tried a q with criteria on an indexed field with only 4200 hits; that
also does not work when the sort parameter is included.

Please help me here, as I am clueless why there is an OOM error when getting 100
documents.

Thanks
Naresh


Re: Getting unique key of a document inside of a Similarity class.

2015-02-20 Thread J-Pro

from all the examples of what you've described, i'm fairly certain all you
really need is a TFIDF based Similarity where coord(), idf(), tf() and
queryNorm() return 1 always, and you omitNorms from all fields.


Yeah, that's what I did in the very first iteration. It works only for
cases #1 and #2. If you try queries 3 and 4 with such a Similarity, you'll get:


3. place:(34\ High\ Street)^3 => doc1(score=9), doc2(score=9)
4. name:DocumentOne^7 OR place:(34\ High\ Street)^3 => doc1(score=16), 
doc2(score=9)


That is not what I need. As I described above, when multiple
tokens match for a field, the method SimScorer.score is called X times,
where X is the number of matched tokens (in cases #3 and #4 there are 3
tokens), so the score sums up. I need to score only once in this
case, regardless of the number of tokens.


How to do it? My first idea was a HashSet based on fieldName, so that after
scoring once, it doesn't score anymore. But in this case only the first
document was scored (since the second and other documents have the same
field name). So I understood that I also need the docID for that. And it
worked fine until I found out (thank you for that) that the docID is
segment-specific. So now I need a segment ID as well (or something similar).




(You didn't give any examples of what you expect to happen with exclusion
clauses in your BooleanQueries


For my needs I won't need exclusion clauses, but in this case the same
would happen - it would score depending on the weight, because the condition is
true:


5. (NOT name:DocumentOne)^7 => doc2(score=7)


Re: Getting unique key of a document inside of a Similarity class.

2015-02-19 Thread Chris Hostetter

: 1. name:DocumentOne^7 => doc1(score=7)
: 2. name:DocumentOne^7 AND place:notExist^3 => doc1(score=7)
: 3. place:(34\ High\ Street)^3 => doc1(score=3), doc2(score=3)
: 4. name:DocumentOne^7 OR place:(34\ High\ Street)^3 => doc1(score=10),
: doc2(score=3)
...
: > it's not clear why you need any sort of unique document identification for
: > you scoring algorithm .. from what you described, matches on fieldA should
: > get score "A" matches on fieldB should get score "B" ... why does it mater
: > which doc is which?
: 
: For case #3, for example, method SimScorer.score is called 3 times for each of
: these documents, total 6 times for both. I have added a
: ThreadLocal<HashSet<String>> to my custom similarity, which is cleared every
: time before new scoring session (after each query execution). This HashSet
: stores strings consisting of fieldName + docID. Every time score() is called,

Ah HA! ... this is why it's an XY problem... you've decided that you need 
a unique identifier for each doc so you can maintain a HashSet of all the 
times a doc matches a term in the query so you can count them ... you 
don't need to do any of that.

from all the examples of what you've described, i'm fairly certain all you 
really need is a TFIDF based Similarity where coord(), idf(), tf() and 
queryNorm() return 1 always, and you omitNorms from all fields.

that's it ... that should literally be everything you need to do.
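
A minimal sketch of that, assuming the Lucene 4.x-era DefaultSimilarity API this
thread is using (the class name is made up):

import org.apache.lucene.search.similarities.DefaultSimilarity;

public class FlatSimilarity extends DefaultSimilarity {
  @Override public float tf(float freq) { return 1f; }                        // ignore how often a term matches
  @Override public float idf(long docFreq, long numDocs) { return 1f; }       // ignore how rare a term is
  @Override public float coord(int overlap, int maxOverlap) { return 1f; }    // ignore how many clauses matched
  @Override public float queryNorm(float sumOfSquaredWeights) { return 1f; }  // no query normalization
}

Combined with omitNorms on the fields, each matching term clause then contributes
exactly its query-time boost.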

(You didn't give any examples of what you expect to happen with exclusion 
clauses in your BooleanQueries, but the approach you were describing 
wouldn't give you any added advantages towards interesting MUST_NOT clauses 
either ... it would in fact only increase the scores for those docs in a 
way that is almost certainly not what you want)


-Hoss
http://www.lucidworks.com/


Re: Getting unique key of a document inside of a Similarity class.

2015-02-19 Thread J-Pro

how are you defining/specifying these field weights?


I define weights inside of a query (name:SomeName^7).



it would help if you could give a concrete example of some sample docs, a
sample query, and what results you would expect ... the sample input and
sample output of the system you are interested in.


Sure. Imagine we have 2 docs:

doc1
-
name:DocumentOne
place:34 High Street (StandardTokenizerFactory, i.e. 3 tokens created)

doc2
-
name:DocumentTwo
place:34 High Street (StandardTokenizerFactory, i.e. 3 tokens created)

I want the following queries return docs with scores:

1. name:DocumentOne^7 => doc1(score=7)
2. name:DocumentOne^7 AND place:notExist^3 => doc1(score=7)
3. place:(34\ High\ Street)^3 => doc1(score=3), doc2(score=3)
4. name:DocumentOne^7 OR place:(34\ High\ Street)^3 => doc1(score=10), 
doc2(score=3)



If you're curious about why do I need it, i.e. about my very initial 
"problem X", then I need this scoring to be able to calculate matching 
percentage. That's a separate topic, I read a lot about it (including 
http://wiki.apache.org/lucene-java/ScoresAsPercentages) and people say 
it's either not doable or very-very complicated with SOLR. So I just 
want to give it a try. For case #3 from above matching percentage is 
100% for both docs. For case #4 it's doc1:100% and doc2:30%.




it's not clear why you need any sort of unique document identification for
your scoring algorithm .. from what you described, matches on fieldA should
get score "A", matches on fieldB should get score "B" ... why does it matter
which doc is which?


For case #3, for example, the method SimScorer.score is called 3 times for
each of these documents, 6 times in total for both. I have added a
ThreadLocal<HashSet<String>> to my custom similarity, which is cleared
every time before a new scoring session (after each query execution). This
HashSet stores strings consisting of fieldName + docID. Every time
score() is called, I check this HashSet - if fieldName + docID exists, I
return 0 as the score, otherwise the field weight.
If there was no docID in this string (only field name), then case #3 
would return the following: doc1(score=3), doc2(score=0). If there was 
no HashSet at all, case #3 would return: doc1(score=9), doc2(score=9) 
since query matched all 3 tokens for every doc.


I know that what I'm doing is a "hack", but that's the only way I've 
found so far to implement percentage matching. I just want to play 
around with it, see how it performs and decide whether to use it or not. 
But for that I need to uniquely identify a document while scoring :)


Re: Getting unique key of a document inside of a Similarity class.

2015-02-19 Thread Chris Hostetter

: Sure, sorry I did not do it before, I just wanted to take minimum of your
: valuable time. So in my custom Similarity class I am trying to implement such
: a logic, where score calculation is only based on field weight and a field
: match - that's it. In other words, if a field matches the query, I want
: "score" method to return this field's weight only, regardless of factors like:
: norms; coord; doc frequencies; fact that field was multivalued and more than
: one value matched; fact that field was tokenized as multiple tokens and more
: than one token matched, etc. As far as I know, there is no such a similarity
: in list of existing ones.

how are you defining/specifying these field weights?

it would help if you could give a concrete example of some sample docs, a 
sample query, and what results you would expect ... the sample input and 
sample output of the system you are interested in.

: In order to implement this, I am trying to score only once for a combination
: of a specific field + doc unique identifier. And I don't care what is this
: unique doc identifier - it can be unique key or it can be internal doc ID.

it's not clear why you need any sort of unique document identification for 
your scoring algorithm .. from what you described, matches on fieldA should 
get score "A", matches on fieldB should get score "B" ... why does it matter 
which doc is which?



-Hoss
http://www.lucidworks.com/


Re: Getting unique key of a document inside of a Similarity class.

2015-02-19 Thread J-Pro
Thank you for your answer, Chris. I will reply with inline comments as 
well. Please see below.



: I need to uniquely identify a document inside of a Similarity class during
: scoring. Is it possible to get value of unique key of a document at this
: point?

Can you tell us a bit more about your usecase ... your problem description
is a bit vague, and sounds like it may be an "XY Problem"...


Sure, sorry I did not do it before, I just wanted to take a minimum of
your valuable time. So in my custom Similarity class I am trying to
implement logic where the score calculation is based only on the field
weight and a field match - that's it. In other words, if a field matches
the query, I want the "score" method to return only this field's weight,
regardless of factors like: norms; coord; doc frequencies; the fact that
the field was multivalued and more than one value matched; the fact that the
field was tokenized as multiple tokens and more than one token matched, etc.
As far as I know, there is no such similarity in the list of existing ones.
In order to implement this, I am trying to score only once for a
combination of a specific field + doc unique identifier. And I don't
care what this unique doc identifier is - it can be the unique key or it can
be the internal doc ID.
I had my implementation working, but as I understood from your answer, I
had it working only for one segment. So now I need to add a segment ID or
something like that to my combination.




Assuming the method you are refering to (you didn't give a specific
class/interface name) is SimScorer.score(doc,req) then the javadocs say...

 doc - document id within the inverted index segment
 freq - sloppy term frequency

...so for #1, yes this is definitely the per-segment docId.


Yes, it's ExactSimScorer.score(int doc, int freq). Ah! Per segment! Here
we go - now I understand why it's 0 after every new commit! The Solr doc says new
docs are written to a new segment. Then question #1 is clear for me.
Thanks, Chris!




for #2: the method for providing a SimScorer to lucene is by implementing
Similarity.simScorer(...) -- that method gets as an argument an
AtomicReaderContext context, which not only has an AtomicReader for the
individual segment, but also details about that segments role in the
larger index.


Interesting details - that may be exactly what I need. If I can somehow
uniquely identify a document using its internal doc id + data from the
context (like a segment id or something), that would be awesome. I have
checked AtomicReaderContext; it has 'ord' (the reader's ord in the
top-level's leaves array) and 'docBase' (the reader's absolute doc base)
- probably what I need. Do you have any more information (maybe links to 
wikis) about this AtomicReaderContext, DocValues, "low" and "top" levels 
(other than javadoc in source code)? I have a high-level understanding, 
but it's obviously not enough for the problem I am solving. I would be 
more than happy to understand it.


Thank you very much for your time, Chris and other people who spend time 
on reading/answering this thread!


Re: Getting unique key of a document inside of a Similarity class.

2015-02-19 Thread Chris Hostetter

: I need to uniquely identify a document inside of a Similarity class during
: scoring. Is it possible to get value of unique key of a document at this
: point?

Can you tell us a bit more about your usecase ... your problem description 
is a bit vague, and sounds like it may be an "XY Problem"...

https://people.apache.org/~hossman/#xyproblem
Your question appears to be an "XY Problem" ... that is: you are dealing
with "X", you are assuming "Y" will help you, and you are asking about "Y"
without giving more details about the "X" so that we can understand the
full issue.  Perhaps the best solution doesn't involve "Y" at all?
See Also: http://www.perlmonks.org/index.pl?node_id=542341

: 1. Is docIds behavior described above a bug or a feature? Obviously, if it's a
: bug and I can use docID to uniquely identify a document, then my question is
: answered after this bug is fixed.
: 2. If docIds behavior described above is normal, then what is an alternative
: way of uniquely identify a document inside of a Similarity class during
: scoring? Can I get unique key of a scoring document in Similarity?

Assuming the method you are referring to (you didn't give a specific
class/interface name) is SimScorer.score(doc, freq), then the javadocs say...

doc - document id within the inverted index segment
freq - sloppy term frequency

...so for #1, yes this is definitely the per-segment docId.

for #2: the method for providing a SimScorer to lucene is by implementing
Similarity.simScorer(...) -- that method gets as an argument an
AtomicReaderContext context, which not only has an AtomicReader for the
individual segment, but also details about that segment's role in the
larger index.

As far as getting the Solr uniqueKey ... it's non trivial, and there are 
different things you could do depending on what your ultimate goal is (ie: 
see my earlier question about XY problem) ... my guess is from this low 
level down in the code you want to use DocValues (aka: FieldCache in older 
versions of lucene) on your uniqueKey field, then ask it for the 
fieldvalue of each internal docId that gets passed to your method -- 
either by using the per-segment DocValues, or by using the 
AtomicReaderContext's base information to determine the "top level" 
internal docId and use the "top level" DocValues/FieldCache

(the per-segment vs "top level" DocValues and internalId stuff can be kind 
of confusing -- start with whichever seems simpler based on your 
understanding of the internal lucene/solr APIs and worry about maybe 
switching to the other approach later once you have something working and 
see if it helps or hinders performance for your usecases)
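
To make that concrete, here is a rough, untested sketch of the per-segment 
approach. It assumes Lucene 4.x era APIs, a stock DefaultSimilarity as the 
delegate, and a string uniqueKey field named "id" with docValues enabled -- 
all of those are assumptions on my part, and the exact DocValues accessor 
signatures vary between Lucene versions, so treat it as a starting point only:

  import java.io.IOException;

  import org.apache.lucene.index.AtomicReaderContext;
  import org.apache.lucene.index.FieldInvertState;
  import org.apache.lucene.index.SortedDocValues;
  import org.apache.lucene.search.CollectionStatistics;
  import org.apache.lucene.search.TermStatistics;
  import org.apache.lucene.search.similarities.DefaultSimilarity;
  import org.apache.lucene.search.similarities.Similarity;
  import org.apache.lucene.util.BytesRef;

  public class KeyAwareSimilarity extends Similarity {

    // delegate the actual scoring math to the stock similarity
    private final Similarity delegate = new DefaultSimilarity();

    @Override
    public long computeNorm(FieldInvertState state) {
      return delegate.computeNorm(state);
    }

    @Override
    public SimWeight computeWeight(float queryBoost, CollectionStatistics collectionStats,
                                   TermStatistics... termStats) {
      return delegate.computeWeight(queryBoost, collectionStats, termStats);
    }

    @Override
    public SimScorer simScorer(SimWeight weight, AtomicReaderContext context) throws IOException {
      final SimScorer base = delegate.simScorer(weight, context);
      // per-segment DocValues for the uniqueKey field (null if this segment has none)
      final SortedDocValues keys = context.reader().getSortedDocValues("id");
      // docBase turns a per-segment doc id into a top-level ("global") doc id
      final int docBase = context.docBase;

      return new SimScorer() {
        @Override
        public float score(int doc, float freq) {
          if (keys != null) {
            BytesRef key = new BytesRef();
            keys.get(doc, key);            // uniqueKey bytes for this per-segment doc id
            int globalDoc = docBase + doc; // top-level doc id, if that is what you need
            // ... use key / globalDoc in whatever custom logic you have ...
          }
          return base.score(doc, freq);
        }

        @Override
        public float computeSlopFactor(int distance) {
          return base.computeSlopFactor(distance);
        }

        @Override
        public float computePayloadFactor(int doc, int start, int end, BytesRef payload) {
          return base.computePayloadFactor(doc, start, end, payload);
        }
      };
    }
  }

In Solr you would still wire a class like this in via the similarity element 
in schema.xml (directly or through a small SimilarityFactory).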

-Hoss
http://www.lucidworks.com/


Getting unique key of a document inside of a Similarity class.

2015-02-19 Thread J-Pro

Good afternoon.

I need to uniquely identify a document inside of a Similarity class 
during scoring. Is it possible to get value of unique key of a document 
at this point?


For some time I thought I could use the internal docID to achieve that. 
The method score(int doc, float freq) is called after every query execution 
for each matched doc. For each indexed doc it equals 0, 1, 2, etc. But this 
is only the case when documents are indexed in bulk, i.e. in a single HTTP 
request. When docs are indexed in separate requests, these docIds 
equal 0 for all documents.


To summarize, here are 2 final questions:

1. Is the docId behavior described above a bug or a feature? Obviously, if 
it's a bug and I can use the docID to uniquely identify a document, then my 
question is answered after this bug is fixed.
2. If the docId behavior described above is normal, then what is an 
alternative way of uniquely identifying a document inside of a Similarity 
class during scoring? Can I get the unique key of a scoring document in 
Similarity?


FYI: I have asked the 1st question in the #solr IRC channel. The person named 
hoss answered the following: "you're seeing the *internal* docIds ... 
you can't assign any special meaning to them ... i believe that at the 
level of the Similarity class, these may even be per segment, which 
means that in the context of a SegmentReader they can be used to get 
things like docValues, but they don't have any meaning compared to your 
uniqueKey (for example)". This kinda makes me think that the answer to the 
1st question is "it's a feature". But I am still not sure and don't know 
the answer to the 2nd question. Please help.


Thank you very much in advance.


Re: Solr Composite Unique key from existing fields in schema

2014-12-09 Thread Ahmet Arslan
Hi,

Once I used the TemplateTransformer to generate a unique id across entities.

http://wiki.apache.org/solr/DataImportHandler#TemplateTransformer




On Wednesday, December 10, 2014 8:51 AM, Rajesh Panneerselvam 
 wrote:
Hi,
I'm using DIH to index my entities. I'm facing an issue while delta-import. 
I've declared multiple entities in one data-config.xml. The entities will have 
different primary key. Now if I want to delta-import how should I mention the 
UniqueKey in schema.xml.
My data-config structure is like this

  
  
  



Thanks
Rajesh
[Aspire Systems]

This e-mail message and any attachments are for the sole use of the intended 
recipient(s) and may contain proprietary, confidential, trade secret or 
privileged information. Any unauthorized review, use, disclosure or 
distribution is prohibited and may be a violation of law. If you are not the 
intended recipient, please contact the sender by reply e-mail and destroy all 
copies of the original message. 


Solr Composite Unique key from existing fields in schema

2014-12-09 Thread Rajesh Panneerselvam
Hi,
I'm using DIH to index my entities. I'm facing an issue while doing a delta-import. 
I've declared multiple entities in one data-config.xml. The entities will have 
different primary keys. Now, if I want to delta-import, how should I specify the 
uniqueKey in schema.xml?
My data-config structure is like this

  
  
  



Thanks
Rajesh
[Aspire Systems]

This e-mail message and any attachments are for the sole use of the intended 
recipient(s) and may contain proprietary, confidential, trade secret or 
privileged information. Any unauthorized review, use, disclosure or 
distribution is prohibited and may be a violation of law. If you are not the 
intended recipient, please contact the sender by reply e-mail and destroy all 
copies of the original message.


Re: fl rename of unique key in solrcloud

2014-11-16 Thread Suchi Amalapurapu
Thanks Jeon, that worked. Now both fields are returned in the response.
It's a bit inefficient but works nevertheless.
Suchi

On Sat, Nov 15, 2014 at 10:44 PM, Jeon Woosung 
wrote:

> I guess that it is caused by the shard returning the renamed field.
>
> following code is source code of solr 4.6
> 
> ===
> 986:if ((sreq.purpose & ShardRequest.PURPOSE_GET_FIELDS) != 0) {
> 987:  boolean returnScores = (rb.getFieldFlags() &
> SolrIndexSearcher.GET_SCORES) != 0;
> 988:
> 989:  assert(sreq.responses.size() == 1);
> 990:  ShardResponse srsp = sreq.responses.get(0);
> 991:  SolrDocumentList docs =
> (SolrDocumentList)srsp.getSolrResponse().getResponse().get("response");
> 992:
> 993:  String keyFieldName =
> rb.req.getSchema().getUniqueKeyField().getName();
> 994:  boolean removeKeyField =
> !rb.rsp.getReturnFields().wantsField(keyFieldName);
> 995:
> 996:  for (SolrDocument doc : docs) {
> 997:Object id = doc.getFieldValue(keyFieldName);
> 998:ShardDoc sdoc = rb.resultIds.get(id.toString());
>
> 
>
> If each shard returns the renamed field name instead of keyFieldName (the
> uniqueKey), "id" at line 998 could be null, because the doc at line 996
> wouldn't have "keyFieldName".
>
>
> So if you are in a hurry or cannot wait for a patch, you can explicitly add
> the unique field as well, like this:
> eg) http://
> /solr//select?q=dress&fl=a1:p1&fl=p1
>
>
>
>
> On Sat, Nov 15, 2014 at 11:26 PM, Garth Grimm <
> garthgr...@averyranchconsulting.com> wrote:
>
> > https://issues.apache.org/jira/browse/SOLR-6744 created.
> >
> > And hopefully correctly, since that’s my first.
> > On Nov 15, 2014, at 9:12 AM, Garth Grimm <
> > garthgr...@averyranchconsulting.com > garthgr...@averyranchconsulting.com>> wrote:
> >
> > I see the same issue on 4.10.1.
> >
> > I’ll open a JIRA if I don’t see one.
> >
> > I guess the best immediate work around is to copy the unique field, and
> > use that field for renaming?
> > On Nov 15, 2014, at 3:18 AM, Suchi Amalapurapu  > <mailto:su...@bloomreach.com>> wrote:
> >
> > Solr version:4.6.1
> >
> > On Sat, Nov 15, 2014 at 12:24 PM, Jeon Woosung  > <mailto:jeonwoos...@gmail.com>>
> > wrote:
> >
> > Could you let me know version of the solr?
> >
> > On Sat, Nov 15, 2014 at 5:05 AM, Suchi Amalapurapu  > <mailto:su...@bloomreach.com>>
> > wrote:
> >
> > Hi
> > Getting the following exception when using fl renaming with unique key in
> > the schema.
> > http:///solr//select?q=dress&fl=a1:p1
> >
> > where p1 is the unique key for 
> > For collections with single shard, this works flawlessly but results in
> > the
> > following exception in case of multiple shards.
> >
> > How do we fix this? Stack trace below.
> > Suchi
> >
> > error": {"trace": "java.lang.NullPointerException\n\tat
> >
> >
> >
> >
> org.apache.solr.handler.component.QueryComponent.returnFields(QueryComponent.java:998)\n\tat
> >
> >
> >
> >
> org.apache.solr.handler.component.QueryComponent.handleRegularResponses(QueryComponent.java:653)\n\tat
> >
> >
> >
> >
> org.apache.solr.handler.component.QueryComponent.handleResponses(QueryComponent.java:628)\n\tat
> >
> >
> >
> >
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:311)\n\tat
> >
> >
> >
> >
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)\n\tat
> > org.apache.solr.core.SolrCore.execute(SolrCore.java:1859)\n\tat
> >
> >
> >
> >
> org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:721)\n\tat
> >
> >
> >
> >
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:417)\n\tat
> >
> >
> >
> >
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:201)\n\tat
> >
> >
> >
> >
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)\n\tat
> >
> >
> >
> >
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)\n\tat
> >
> >
> >
> >
> org.eclipse.jetty.server

Re: fl rename of unique key in solrcloud

2014-11-15 Thread Jeon Woosung
I guess that it is caused by the shard returning the renamed field.

following code is source code of solr 4.6

===
986:if ((sreq.purpose & ShardRequest.PURPOSE_GET_FIELDS) != 0) {
987:  boolean returnScores = (rb.getFieldFlags() &
SolrIndexSearcher.GET_SCORES) != 0;
988:
989:  assert(sreq.responses.size() == 1);
990:  ShardResponse srsp = sreq.responses.get(0);
991:  SolrDocumentList docs =
(SolrDocumentList)srsp.getSolrResponse().getResponse().get("response");
992:
993:  String keyFieldName =
rb.req.getSchema().getUniqueKeyField().getName();
994:  boolean removeKeyField =
!rb.rsp.getReturnFields().wantsField(keyFieldName);
995:
996:  for (SolrDocument doc : docs) {
997:Object id = doc.getFieldValue(keyFieldName);
998:ShardDoc sdoc = rb.resultIds.get(id.toString());


If each shard returns the renamed field name instead of keyFieldName (the
uniqueKey), "id" at line 998 could be null, because the doc at line 996
wouldn't have "keyFieldName".


So if you are in a hurry or cannot wait for a patch, you can explicitly add
the unique field as well, like this:
eg) http:///solr//select?q=dress&fl=a1:p1&fl=p1




On Sat, Nov 15, 2014 at 11:26 PM, Garth Grimm <
garthgr...@averyranchconsulting.com> wrote:

> https://issues.apache.org/jira/browse/SOLR-6744 created.
>
> And hopefully correctly, since that’s my first.
> On Nov 15, 2014, at 9:12 AM, Garth Grimm <
> garthgr...@averyranchconsulting.com garthgr...@averyranchconsulting.com>> wrote:
>
> I see the same issue on 4.10.1.
>
> I’ll open a JIRA if I don’t see one.
>
> I guess the best immediate work around is to copy the unique field, and
> use that field for renaming?
> On Nov 15, 2014, at 3:18 AM, Suchi Amalapurapu  <mailto:su...@bloomreach.com>> wrote:
>
> Solr version:4.6.1
>
> On Sat, Nov 15, 2014 at 12:24 PM, Jeon Woosung  <mailto:jeonwoos...@gmail.com>>
> wrote:
>
> Could you let me know version of the solr?
>
> On Sat, Nov 15, 2014 at 5:05 AM, Suchi Amalapurapu  <mailto:su...@bloomreach.com>>
> wrote:
>
> Hi
> Getting the following exception when using fl renaming with unique key in
> the schema.
> http:///solr//select?q=dress&fl=a1:p1
>
> where p1 is the unique key for 
> For collections with single shard, this works flawlessly but results in
> the
> following exception in case of multiple shards.
>
> How do we fix this? Stack trace below.
> Suchi
>
> error": {"trace": "java.lang.NullPointerException\n\tat
>
>
>
> org.apache.solr.handler.component.QueryComponent.returnFields(QueryComponent.java:998)\n\tat
>
>
>
> org.apache.solr.handler.component.QueryComponent.handleRegularResponses(QueryComponent.java:653)\n\tat
>
>
>
> org.apache.solr.handler.component.QueryComponent.handleResponses(QueryComponent.java:628)\n\tat
>
>
>
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:311)\n\tat
>
>
>
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)\n\tat
> org.apache.solr.core.SolrCore.execute(SolrCore.java:1859)\n\tat
>
>
>
> org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:721)\n\tat
>
>
>
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:417)\n\tat
>
>
>
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:201)\n\tat
>
>
>
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)\n\tat
>
>
>
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)\n\tat
>
>
>
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)\n\tat
>
>
>
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)\n\tat
>
>
>
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)\n\tat
>
>
>
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)\n\tat
>
>
>
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)\n\tat
>
>
>
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)\n\tat
>
>
>
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)\n\tat
>
>
>
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)\n\tat
>
>
>
> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)\n\tat
>
>
&g

Re: fl rename of unique key in solrcloud

2014-11-15 Thread Garth Grimm
https://issues.apache.org/jira/browse/SOLR-6744 created.

And hopefully correctly, since that’s my first.
On Nov 15, 2014, at 9:12 AM, Garth Grimm 
mailto:garthgr...@averyranchconsulting.com>>
 wrote:

I see the same issue on 4.10.1.

I’ll open a JIRA if I don’t see one.

I guess the best immediate work around is to copy the unique field, and use 
that field for renaming?
On Nov 15, 2014, at 3:18 AM, Suchi Amalapurapu 
mailto:su...@bloomreach.com>> wrote:

Solr version:4.6.1

On Sat, Nov 15, 2014 at 12:24 PM, Jeon Woosung 
mailto:jeonwoos...@gmail.com>>
wrote:

Could you let me know version of the solr?

On Sat, Nov 15, 2014 at 5:05 AM, Suchi Amalapurapu 
mailto:su...@bloomreach.com>>
wrote:

Hi
Getting the following exception when using fl renaming with unique key in
the schema.
http:///solr//select?q=dress&fl=a1:p1

where p1 is the unique key for 
For collections with single shard, this works flawlessly but results in
the
following exception in case of multiple shards.

How do we fix this? Stack trace below.
Suchi

error": {"trace": "java.lang.NullPointerException\n\tat


org.apache.solr.handler.component.QueryComponent.returnFields(QueryComponent.java:998)\n\tat


org.apache.solr.handler.component.QueryComponent.handleRegularResponses(QueryComponent.java:653)\n\tat


org.apache.solr.handler.component.QueryComponent.handleResponses(QueryComponent.java:628)\n\tat


org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:311)\n\tat


org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)\n\tat
org.apache.solr.core.SolrCore.execute(SolrCore.java:1859)\n\tat


org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:721)\n\tat


org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:417)\n\tat


org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:201)\n\tat


org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)\n\tat


org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)\n\tat


org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)\n\tat


org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)\n\tat


org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)\n\tat


org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)\n\tat


org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)\n\tat


org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)\n\tat


org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)\n\tat


org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)\n\tat


org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)\n\tat


org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)\n\tat


org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)\n\tat
org.eclipse.jetty.server.Server.handle(Server.java:368)\n\tat


org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)\n\tat


org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)\n\tat


org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:942)\n\tat


org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1004)\n\tat
org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:640)\n\tat

org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)\n\tat


org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)\n\tat


org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)\n\tat


org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)\n\tat


org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)\n\tat
java.lang.Thread.run(Thread.java:662)\n","code": 500




--
*God bless U*





Re: fl rename of unique key in solrcloud

2014-11-15 Thread Garth Grimm
I see the same issue on 4.10.1.

I’ll open a JIRA if I don’t see one.

I guess the best immediate work around is to copy the unique field, and use 
that field for renaming?
> On Nov 15, 2014, at 3:18 AM, Suchi Amalapurapu  wrote:
> 
> Solr version:4.6.1
> 
> On Sat, Nov 15, 2014 at 12:24 PM, Jeon Woosung 
> wrote:
> 
>> Could you let me know version of the solr?
>> 
>> On Sat, Nov 15, 2014 at 5:05 AM, Suchi Amalapurapu 
>> wrote:
>> 
>>> Hi
>>> Getting the following exception when using fl renaming with unique key in
>>> the schema.
>>> http:///solr//select?q=dress&fl=a1:p1
>>> 
>>> where p1 is the unique key for 
>>> For collections with single shard, this works flawlessly but results in
>> the
>>> following exception in case of multiple shards.
>>> 
>>> How do we fix this? Stack trace below.
>>> Suchi
>>> 
>>> error": {"trace": "java.lang.NullPointerException\n\tat
>>> 
>>> 
>> org.apache.solr.handler.component.QueryComponent.returnFields(QueryComponent.java:998)\n\tat
>>> 
>>> 
>> org.apache.solr.handler.component.QueryComponent.handleRegularResponses(QueryComponent.java:653)\n\tat
>>> 
>>> 
>> org.apache.solr.handler.component.QueryComponent.handleResponses(QueryComponent.java:628)\n\tat
>>> 
>>> 
>> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:311)\n\tat
>>> 
>>> 
>> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)\n\tat
>>> org.apache.solr.core.SolrCore.execute(SolrCore.java:1859)\n\tat
>>> 
>>> 
>> org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:721)\n\tat
>>> 
>>> 
>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:417)\n\tat
>>> 
>>> 
>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:201)\n\tat
>>> 
>>> 
>> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)\n\tat
>>> 
>>> 
>> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)\n\tat
>>> 
>>> 
>> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)\n\tat
>>> 
>>> 
>> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)\n\tat
>>> 
>>> 
>> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)\n\tat
>>> 
>>> 
>> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)\n\tat
>>> 
>>> 
>> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)\n\tat
>>> 
>>> 
>> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)\n\tat
>>> 
>>> 
>> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)\n\tat
>>> 
>>> 
>> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)\n\tat
>>> 
>>> 
>> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)\n\tat
>>> 
>>> 
>> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)\n\tat
>>> 
>>> 
>> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)\n\tat
>>> org.eclipse.jetty.server.Server.handle(Server.java:368)\n\tat
>>> 
>>> 
>> org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)\n\tat
>>> 
>>> 
>> org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)\n\tat
>>> 
>>> 
>> org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:942)\n\tat
>>> 
>>> 
>> org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1004)\n\tat
>>> org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:640)\n\tat
>>> 
>> org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)\n\tat
>>> 
>>> 
>> org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)\n\tat
>>> 
>>> 
>> org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)\n\tat
>>> 
>>> 
>> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)\n\tat
>>> 
>>> 
>> org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)\n\tat
>>> java.lang.Thread.run(Thread.java:662)\n","code": 500
>>> 
>> 
>> 
>> 
>> --
>> *God bless U*
>> 



Re: fl rename of unique key in solrcloud

2014-11-15 Thread Suchi Amalapurapu
Solr version:4.6.1

On Sat, Nov 15, 2014 at 12:24 PM, Jeon Woosung 
wrote:

> Could you let me know version of the solr?
>
> On Sat, Nov 15, 2014 at 5:05 AM, Suchi Amalapurapu 
> wrote:
>
> > Hi
> > Getting the following exception when using fl renaming with unique key in
> > the schema.
> > http:///solr//select?q=dress&fl=a1:p1
> >
> > where p1 is the unique key for 
> > For collections with single shard, this works flawlessly but results in
> the
> > following exception in case of multiple shards.
> >
> > How do we fix this? Stack trace below.
> > Suchi
> >
> > error": {"trace": "java.lang.NullPointerException\n\tat
> >
> >
> org.apache.solr.handler.component.QueryComponent.returnFields(QueryComponent.java:998)\n\tat
> >
> >
> org.apache.solr.handler.component.QueryComponent.handleRegularResponses(QueryComponent.java:653)\n\tat
> >
> >
> org.apache.solr.handler.component.QueryComponent.handleResponses(QueryComponent.java:628)\n\tat
> >
> >
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:311)\n\tat
> >
> >
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)\n\tat
> > org.apache.solr.core.SolrCore.execute(SolrCore.java:1859)\n\tat
> >
> >
> org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:721)\n\tat
> >
> >
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:417)\n\tat
> >
> >
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:201)\n\tat
> >
> >
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)\n\tat
> >
> >
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)\n\tat
> >
> >
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)\n\tat
> >
> >
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)\n\tat
> >
> >
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)\n\tat
> >
> >
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)\n\tat
> >
> >
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)\n\tat
> >
> >
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)\n\tat
> >
> >
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)\n\tat
> >
> >
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)\n\tat
> >
> >
> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)\n\tat
> >
> >
> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)\n\tat
> >
> >
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)\n\tat
> > org.eclipse.jetty.server.Server.handle(Server.java:368)\n\tat
> >
> >
> org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)\n\tat
> >
> >
> org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)\n\tat
> >
> >
> org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:942)\n\tat
> >
> >
> org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1004)\n\tat
> > org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:640)\n\tat
> >
> org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)\n\tat
> >
> >
> org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)\n\tat
> >
> >
> org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)\n\tat
> >
> >
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)\n\tat
> >
> >
> org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)\n\tat
> > java.lang.Thread.run(Thread.java:662)\n","code": 500
> >
>
>
>
> --
> *God bless U*
>


Re: fl rename of unique key in solrcloud

2014-11-14 Thread Jeon Woosung
Could you let me know version of the solr?

On Sat, Nov 15, 2014 at 5:05 AM, Suchi Amalapurapu 
wrote:

> Hi
> Getting the following exception when using fl renaming with unique key in
> the schema.
> http:///solr//select?q=dress&fl=a1:p1
>
> where p1 is the unique key for 
> For collections with single shard, this works flawlessly but results in the
> following exception in case of multiple shards.
>
> How do we fix this? Stack trace below.
> Suchi
>
> error": {"trace": "java.lang.NullPointerException\n\tat
>
> org.apache.solr.handler.component.QueryComponent.returnFields(QueryComponent.java:998)\n\tat
>
> org.apache.solr.handler.component.QueryComponent.handleRegularResponses(QueryComponent.java:653)\n\tat
>
> org.apache.solr.handler.component.QueryComponent.handleResponses(QueryComponent.java:628)\n\tat
>
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:311)\n\tat
>
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)\n\tat
> org.apache.solr.core.SolrCore.execute(SolrCore.java:1859)\n\tat
>
> org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:721)\n\tat
>
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:417)\n\tat
>
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:201)\n\tat
>
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)\n\tat
>
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)\n\tat
>
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)\n\tat
>
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)\n\tat
>
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)\n\tat
>
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)\n\tat
>
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)\n\tat
>
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)\n\tat
>
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)\n\tat
>
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)\n\tat
>
> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)\n\tat
>
> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)\n\tat
>
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)\n\tat
> org.eclipse.jetty.server.Server.handle(Server.java:368)\n\tat
>
> org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)\n\tat
>
> org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)\n\tat
>
> org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:942)\n\tat
>
> org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1004)\n\tat
> org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:640)\n\tat
> org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)\n\tat
>
> org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)\n\tat
>
> org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)\n\tat
>
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)\n\tat
>
> org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)\n\tat
> java.lang.Thread.run(Thread.java:662)\n","code": 500
>



-- 
*God bless U*


fl rename of unique key in solrcloud

2014-11-14 Thread Suchi Amalapurapu
Hi
Getting the following exception when using fl renaming with unique key in
the schema.
http:///solr//select?q=dress&fl=a1:p1

where p1 is the unique key for 
For collections with single shard, this works flawlessly but results in the
following exception in case of multiple shards.

How do we fix this? Stack trace below.
Suchi

error": {"trace": "java.lang.NullPointerException\n\tat
org.apache.solr.handler.component.QueryComponent.returnFields(QueryComponent.java:998)\n\tat
org.apache.solr.handler.component.QueryComponent.handleRegularResponses(QueryComponent.java:653)\n\tat
org.apache.solr.handler.component.QueryComponent.handleResponses(QueryComponent.java:628)\n\tat
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:311)\n\tat
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)\n\tat
org.apache.solr.core.SolrCore.execute(SolrCore.java:1859)\n\tat
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:721)\n\tat
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:417)\n\tat
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:201)\n\tat
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)\n\tat
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)\n\tat
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)\n\tat
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)\n\tat
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)\n\tat
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)\n\tat
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)\n\tat
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)\n\tat
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)\n\tat
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)\n\tat
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)\n\tat
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)\n\tat
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)\n\tat
org.eclipse.jetty.server.Server.handle(Server.java:368)\n\tat
org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)\n\tat
org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)\n\tat
org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:942)\n\tat
org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1004)\n\tat
org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:640)\n\tat
org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)\n\tat
org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)\n\tat
org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)\n\tat
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)\n\tat
org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)\n\tat
java.lang.Thread.run(Thread.java:662)\n","code": 500


Re: duplicate unique key after partial update in solr 4.10

2014-10-06 Thread Mikhail Khludnev
It seems to be by design, see:
https://issues.apache.org/jira/browse/SOLR-5211
You can't update a parent doc from within the block.

On Tue, Oct 7, 2014 at 9:44 AM, Ali Nazemian  wrote:

> The list of docs before do partial update:
> 
> product01
> car
> product
> 
> part01
> wheels
> part
> 
> 
> part02
> engine
> part
> 
> 
> part03
> brakes
> part
> 
> 
> 
> product02
> truck
> product
> 
> part04
> wheels
> part
> 
> 
> part05
> flaps
> part
> 
> 
>
> The list of docs after doing partial update of field read_flag for document
> "product01":
> 
> product01
> car
> product
> true
> 
> part01
> wheels
> part
> 
> 
> part02
> engine
> part
> 
> 
> part03
> brakes
> part
> 
> 
> 
> product02
> truck
> product
> 
> part04
> wheels
> part
> 
> 
> part05
> flaps
> part
> 
> 
>
> The list of documents after sending same documents again. (it should
> overwrite on the last one because of duplicate IDs)
>
> product01
> car
> product
> true
>   
>   
> product01
> car
> product
> 
> part01
> wheels
> part
> 
> 
> part02
> engine
> part
> 
> 
> part03
> brakes
> part
> 
> 
> 
> product02
> truck
> product
> 
> part04
> wheels
> part
> 
> 
> part05
> flaps
> part
> 
> 
>
> But as you can see there are two different version of documents with the
> same ID (which is product01).
>
> Regards.
>
> On Mon, Oct 6, 2014 at 8:18 PM, Alexandre Rafalovitch 
> wrote:
>
> > Can you upload the update documents then (into a Gist or similar).
> > Just so that people didn't have to re-imagine exact steps. Because, if
> > it fully checks out, it might be a bug and the next step would be
> > creating a JIRA ticket.
> >
> > Regards,
> >Alex.
> > Personal: http://www.outerthoughts.com/ and @arafalov
> > Solr resources and newsletter: http://www.solr-start.com/ and @solrstart
> > Solr popularizers community: https://www.linkedin.com/groups?gid=6713853
> >
> >
> > On 6 October 2014 11:23, Ali Nazemian  wrote:
> > > Dear Alex,
> > > Hi,
> > > LOL, yeah I am sure. You can test it yourself. I did that on default
> > schema
> > > too. The results are same!
> > > Regards.
> > >
> > > On Mon, Oct 6, 2014 at 4:20 PM, Alexandre Rafalovitch <
> > arafa...@gmail.com>
> > > wrote:
> > >
> > >> A stupid question: Are you sure that what schema thinks your uniqueId
> > >> is - is the uniqueId in your setup? Also, that you are not somehow
> > >> using the flags to tell Solr to ignore duplicates?
> > >>
> > >> Regards,
> > >>Alex.
> > >> Personal: http://www.outerthoughts.com/ and @arafalov
> > >> Solr resources and newsletter: http://www.solr-start.com/ and
> > @solrstart
> > >> Solr popularizers community:
> > https://www.linkedin.com/groups?gid=6713853
> > >>
> > >>
> > >> On 6 October 2014 03:40, Ali Nazemian  wrote:
> > >> > Dear all,
> > >> > Hi,
> > >> > I am goi

Re: duplicate unique key after partial update in solr 4.10

2014-10-06 Thread Ali Nazemian
The list of docs before doing the partial update:

product01
car
product

part01
wheels
part


part02
engine
part


part03
brakes
part



product02
truck
product

part04
wheels
part


part05
flaps
part



The list of docs after doing a partial update of the field read_flag for document
"product01":

product01
car
product
true

part01
wheels
part


part02
engine
part


part03
brakes
part



product02
truck
product

part04
wheels
part


part05
flaps
part



The list of documents after sending the same documents again (they should
overwrite the previous ones because of the duplicate IDs):
   
product01
car
product
true
  
  
product01
car
product

part01
wheels
part


part02
engine
part


part03
brakes
part



product02
truck
product

part04
wheels
part


part05
flaps
part



But as you can see there are two different versions of the document with the
same ID (which is product01).

Regards.

On Mon, Oct 6, 2014 at 8:18 PM, Alexandre Rafalovitch 
wrote:

> Can you upload the update documents then (into a Gist or similar).
> Just so that people didn't have to re-imagine exact steps. Because, if
> it fully checks out, it might be a bug and the next step would be
> creating a JIRA ticket.
>
> Regards,
>Alex.
> Personal: http://www.outerthoughts.com/ and @arafalov
> Solr resources and newsletter: http://www.solr-start.com/ and @solrstart
> Solr popularizers community: https://www.linkedin.com/groups?gid=6713853
>
>
> On 6 October 2014 11:23, Ali Nazemian  wrote:
> > Dear Alex,
> > Hi,
> > LOL, yeah I am sure. You can test it yourself. I did that on default
> schema
> > too. The results are same!
> > Regards.
> >
> > On Mon, Oct 6, 2014 at 4:20 PM, Alexandre Rafalovitch <
> arafa...@gmail.com>
> > wrote:
> >
> >> A stupid question: Are you sure that what schema thinks your uniqueId
> >> is - is the uniqueId in your setup? Also, that you are not somehow
> >> using the flags to tell Solr to ignore duplicates?
> >>
> >> Regards,
> >>Alex.
> >> Personal: http://www.outerthoughts.com/ and @arafalov
> >> Solr resources and newsletter: http://www.solr-start.com/ and
> @solrstart
> >> Solr popularizers community:
> https://www.linkedin.com/groups?gid=6713853
> >>
> >>
> >> On 6 October 2014 03:40, Ali Nazemian  wrote:
> >> > Dear all,
> >> > Hi,
> >> > I am going to do partial update on a field that has not any value.
> >> Suppose
> >> > I have a document with document id (unique key) '12345' and field
> >> > "read_flag" which does not index at the first place. So the read_flag
> >> field
> >> > for this document has not any value. After I did partial update to
> this
> >> > document to set "read_flag"="true", I faced strange problem. Next
> time I
> >> > indexed same document with same values I saw two different version of
> >> > document with id '12345' in solr. One of them with read_flag=true and
> >> > another one without read_flag field! I dont want to have duplicate
> >> > documents (as it should not to be because of unique_key id). Would you
> >> > please tell me what caused such problem?
> >> > Best regards.
> >> >
> >> > --
> >> > A.Nazemian
> >>
> >
> >
> >
> > --
> > A.Nazemian
>



-- 
A.Nazemian


Re: duplicate unique key after partial update in solr 4.10

2014-10-06 Thread Alexandre Rafalovitch
Can you upload the update documents then (into a Gist or similar),
just so that people don't have to re-imagine the exact steps? Because, if
it fully checks out, it might be a bug and the next step would be
creating a JIRA ticket.

Regards,
   Alex.
Personal: http://www.outerthoughts.com/ and @arafalov
Solr resources and newsletter: http://www.solr-start.com/ and @solrstart
Solr popularizers community: https://www.linkedin.com/groups?gid=6713853


On 6 October 2014 11:23, Ali Nazemian  wrote:
> Dear Alex,
> Hi,
> LOL, yeah I am sure. You can test it yourself. I did that on default schema
> too. The results are same!
> Regards.
>
> On Mon, Oct 6, 2014 at 4:20 PM, Alexandre Rafalovitch 
> wrote:
>
>> A stupid question: Are you sure that what schema thinks your uniqueId
>> is - is the uniqueId in your setup? Also, that you are not somehow
>> using the flags to tell Solr to ignore duplicates?
>>
>> Regards,
>>Alex.
>> Personal: http://www.outerthoughts.com/ and @arafalov
>> Solr resources and newsletter: http://www.solr-start.com/ and @solrstart
>> Solr popularizers community: https://www.linkedin.com/groups?gid=6713853
>>
>>
>> On 6 October 2014 03:40, Ali Nazemian  wrote:
>> > Dear all,
>> > Hi,
>> > I am going to do partial update on a field that has not any value.
>> Suppose
>> > I have a document with document id (unique key) '12345' and field
>> > "read_flag" which does not index at the first place. So the read_flag
>> field
>> > for this document has not any value. After I did partial update to this
>> > document to set "read_flag"="true", I faced strange problem. Next time I
>> > indexed same document with same values I saw two different version of
>> > document with id '12345' in solr. One of them with read_flag=true and
>> > another one without read_flag field! I dont want to have duplicate
>> > documents (as it should not to be because of unique_key id). Would you
>> > please tell me what caused such problem?
>> > Best regards.
>> >
>> > --
>> > A.Nazemian
>>
>
>
>
> --
> A.Nazemian


Re: duplicate unique key after partial update in solr 4.10

2014-10-06 Thread Ali Nazemian
Dear Alex,
Hi,
LOL, yeah I am sure. You can test it yourself. I did that on the default schema
too. The results are the same!
Regards.

On Mon, Oct 6, 2014 at 4:20 PM, Alexandre Rafalovitch 
wrote:

> A stupid question: Are you sure that what schema thinks your uniqueId
> is - is the uniqueId in your setup? Also, that you are not somehow
> using the flags to tell Solr to ignore duplicates?
>
> Regards,
>Alex.
> Personal: http://www.outerthoughts.com/ and @arafalov
> Solr resources and newsletter: http://www.solr-start.com/ and @solrstart
> Solr popularizers community: https://www.linkedin.com/groups?gid=6713853
>
>
> On 6 October 2014 03:40, Ali Nazemian  wrote:
> > Dear all,
> > Hi,
> > I am going to do partial update on a field that has not any value.
> Suppose
> > I have a document with document id (unique key) '12345' and field
> > "read_flag" which does not index at the first place. So the read_flag
> field
> > for this document has not any value. After I did partial update to this
> > document to set "read_flag"="true", I faced strange problem. Next time I
> > indexed same document with same values I saw two different version of
> > document with id '12345' in solr. One of them with read_flag=true and
> > another one without read_flag field! I dont want to have duplicate
> > documents (as it should not to be because of unique_key id). Would you
> > please tell me what caused such problem?
> > Best regards.
> >
> > --
> > A.Nazemian
>



-- 
A.Nazemian


Re: duplicate unique key after partial update in solr 4.10

2014-10-06 Thread Alexandre Rafalovitch
A stupid question: Are you sure that what schema thinks your uniqueId
is - is the uniqueId in your setup? Also, that you are not somehow
using the flags to tell Solr to ignore duplicates?

Regards,
   Alex.
Personal: http://www.outerthoughts.com/ and @arafalov
Solr resources and newsletter: http://www.solr-start.com/ and @solrstart
Solr popularizers community: https://www.linkedin.com/groups?gid=6713853


On 6 October 2014 03:40, Ali Nazemian  wrote:
> Dear all,
> Hi,
> I am going to do partial update on a field that has not any value. Suppose
> I have a document with document id (unique key) '12345' and field
> "read_flag" which does not index at the first place. So the read_flag field
> for this document has not any value. After I did partial update to this
> document to set "read_flag"="true", I faced strange problem. Next time I
> indexed same document with same values I saw two different version of
> document with id '12345' in solr. One of them with read_flag=true and
> another one without read_flag field! I dont want to have duplicate
> documents (as it should not to be because of unique_key id). Would you
> please tell me what caused such problem?
> Best regards.
>
> --
> A.Nazemian


duplicate unique key after partial update in solr 4.10

2014-10-06 Thread Ali Nazemian
Dear all,
Hi,
I am going to do a partial update on a field that has no value yet. Suppose
I have a document with document id (unique key) '12345' and a field
"read_flag" which is not indexed in the first place, so the read_flag field
for this document has no value. After I did a partial update on this
document to set "read_flag"="true", I faced a strange problem. The next time I
indexed the same document with the same values, I saw two different versions of
the document with id '12345' in Solr: one of them with read_flag=true and
another one without the read_flag field! I don't want to have duplicate
documents (and there should not be any, because of the unique key id). Would you
please tell me what caused such a problem?
Best regards.

-- 
A.Nazemian


Re: Are there any performance impact of using a non-standard length UUID as the unique key of Solr?

2014-07-24 Thread Mark Miller
Some good info on unique id’s for Lucene / Solr can be found here: 
http://blog.mikemccandless.com/2014/05/choosing-fast-unique-identifier-uuid.html
-- 
Mark Miller
about.me/markrmiller

On July 24, 2014 at 9:51:28 PM, He haobo (haob...@gmail.com) wrote:

Hi,  

In our Solr collection (Solr 4.8), we have the following unique key  
definition.  
  

id  


In our external java program, we will generate an UUID with  
UUID.randomUUID().toString() first. Then, we will use Cryptographic hash to  
generate a 32 bytes length text and finally use it as id.  

For now, we might need to post more than 20k Solr docs per second. Then  
UUID.randomUUID() or the Cryptographic hash stuff might take time. We might  
have a simple workaround to share one Cryptographic hash stuff for many  
Solr docs. Namely, we want to append sequence to Cryptographic hash such  
as 9AD0BB6DDD7AA9FE4D9EB1FF16B3BDFY00,  
9AD0BB6DDD7AA9FE4D9EB1FF16B3BDFY01,  
9AD0BB6DDD7AA9FE4D9EB1FF16B3BDFY02, etc.  


What we want to know, if we use a 38 bytes length id, are there any  
performance impact for Solr data insert or query? Or, if we use Solr's  
default automatically generated id implementation, should it be more  
efficient?  



Thanks,  
Eternal  


Are there any performance impact of using a non-standard length UUID as the unique key of Solr?

2014-07-24 Thread He haobo
Hi,

In our Solr collection (Solr 4.8), we have the following unique key
definition.
 

 id


In our external Java program, we will generate a UUID with
UUID.randomUUID().toString() first. Then, we will use a cryptographic hash to
generate a 32-character text and finally use it as the id.

For now, we might need to post more than 20k Solr docs per second, and
UUID.randomUUID() or the cryptographic hashing might take time. We might
have a simple workaround: share one cryptographic hash across many
Solr docs. Namely, we want to append a sequence to the cryptographic hash,
such as 9AD0BB6DDD7AA9FE4D9EB1FF16B3BDFY00,
9AD0BB6DDD7AA9FE4D9EB1FF16B3BDFY01,
9AD0BB6DDD7AA9FE4D9EB1FF16B3BDFY02, etc.
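
For example, the id generator might look roughly like the sketch below (a
hypothetical helper, not our actual code; MD5 is used only to illustrate a
32-character hex prefix):

  import java.nio.charset.StandardCharsets;
  import java.security.MessageDigest;
  import java.security.NoSuchAlgorithmException;
  import java.util.UUID;

  public class BatchIdGenerator {
    private final String prefix;  // hashed once per batch of docs
    private long sequence = 0;

    public BatchIdGenerator() throws NoSuchAlgorithmException {
      MessageDigest md5 = MessageDigest.getInstance("MD5");
      byte[] digest = md5.digest(UUID.randomUUID().toString().getBytes(StandardCharsets.UTF_8));
      StringBuilder hex = new StringBuilder();
      for (byte b : digest) {
        hex.append(String.format("%02x", b));  // 16 bytes -> 32 hex characters
      }
      this.prefix = hex.toString();
    }

    // Returns ids like <32-char-hash>000001, <32-char-hash>000002, ...
    public synchronized String nextId() {
      return prefix + String.format("%06d", ++sequence);
    }
  }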


What we want to know is: if we use a 38-character id, is there any
performance impact for Solr data insert or query? Or, if we use Solr's
default automatically generated id implementation, would it be more
efficient?



Thanks,
Eternal


Re: Duplicate Unique Key

2014-04-08 Thread Simon
Merging indexes is not the case here, as I am not doing that.  Even though the
issue is gone for now, it is not a relief for me, as I am not sure how to explain
this to others (peers, boss and users).  I am thinking of implementing a watchdog
that checks whether the total number of Solr documents exceeds the number of
items in the database; if so, it will raise a flag so that I can do something
before getting complaints.
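
Roughly, something like the following sketch is what I have in mind (SolrJ 4.x
APIs assumed; the Solr URL, JDBC URL and table name are just placeholders):

  import java.sql.Connection;
  import java.sql.DriverManager;
  import java.sql.ResultSet;
  import java.sql.Statement;

  import org.apache.solr.client.solrj.SolrQuery;
  import org.apache.solr.client.solrj.impl.HttpSolrServer;
  import org.apache.solr.client.solrj.response.QueryResponse;

  public class IndexCountWatchdog {
    public static void main(String[] args) throws Exception {
      // total docs in Solr (rows=0: we only need numFound)
      HttpSolrServer solr = new HttpSolrServer("http://localhost:8983/solr/collection1");
      SolrQuery q = new SolrQuery("*:*");
      q.setRows(0);
      QueryResponse rsp = solr.query(q);
      long solrCount = rsp.getResults().getNumFound();

      // total rows in the source database
      long dbCount;
      try (Connection conn = DriverManager.getConnection("jdbc:mysql://localhost/mydb", "user", "pass");
           Statement st = conn.createStatement();
           ResultSet rs = st.executeQuery("SELECT COUNT(*) FROM items")) {
        rs.next();
        dbCount = rs.getLong(1);
      }

      if (solrCount > dbCount) {
        System.err.println("ALERT: Solr has " + solrCount + " docs but the database has only "
            + dbCount + " rows - possible duplicate uniqueKeys");
      }
    }
  }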





--
View this message in context: 
http://lucene.472066.n3.nabble.com/Duplicate-Unique-Key-tp4129651p4129894.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Duplicate Unique Key

2014-04-08 Thread Erick Erickson
Right, this is expected behavior. The real problem isn't data loss,
but how do you know which doc should "win"? Merging indexes is for a
rather narrowly-defined use-case, it was never intended to remove
duplicates.

Best,
Erick

On Tue, Apr 8, 2014 at 12:36 AM, Cihad Guzel  wrote:
> Hi.
>
> I have encountered a similar situation  when I tested solr merge index . (
> http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201403.mbox/%3CCAMrn6cOVWohxooRzZ8NmwYQUda2GW+gYD+edvC_b_kGT=f4...@mail.gmail.com%3E
>  )
>
> I have had duplicates. But the duplicates are gone when I post same data
> for indexing. I think this was done in order to prevent data loss while
> merging index.
>
>
>
>
> 2014-04-07 23:04 GMT+03:00 Erick Erickson :
>
>> Oh my yes! I feel a great sense of relief every time an intermittent
>> problem becomes reproducible... The problem is not solved, but at
>> least I have a good feeling that once I don't see it any more it's
>> _really_ gone!
>>
>> One possibility is index merging, see:
>> https://wiki.apache.org/solr/MergingSolrIndexes. When you merge
>> indexes, there is no duplicate id checking performed, so you can well
>> have duplicates. That's a wild shot in the dark though.
>>
>> Best,
>> Erick
>>
>> On Mon, Apr 7, 2014 at 12:26 PM, Simon  wrote:
>> > Erick,
>> >
>> > It's indeed quite odd.  And after I trigger re-indexing all documents
>> (via
>> > the normal process of existing program). The duplication is gone.  It can
>> > not be reproduced easily.  But it did occur occasionally and that makes
>> it a
>> > frustrating task to troubleshoot.
>> >
>> > Thanks,
>> > Simon
>> >
>> >
>> >
>> > --
>> > View this message in context:
>> http://lucene.472066.n3.nabble.com/Duplicate-Unique-Key-tp4129651p4129701.html
>> > Sent from the Solr - User mailing list archive at Nabble.com.
>>


Re: Duplicate Unique Key

2014-04-08 Thread Cihad Guzel
Hi.

I have encountered a similar situation  when I tested solr merge index . (
http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201403.mbox/%3CCAMrn6cOVWohxooRzZ8NmwYQUda2GW+gYD+edvC_b_kGT=f4...@mail.gmail.com%3E
 )

I have had duplicates, but the duplicates are gone when I post the same data
for indexing. I think this was done in order to prevent data loss while
merging indexes.




2014-04-07 23:04 GMT+03:00 Erick Erickson :

> Oh my yes! I feel a great sense of relief every time an intermittent
> problem becomes reproducible... The problem is not solved, but at
> least I have a good feeling that once I don't see it any more it's
> _really_ gone!
>
> One possibility is index merging, see:
> https://wiki.apache.org/solr/MergingSolrIndexes. When you merge
> indexes, there is no duplicate id checking performed, so you can well
> have duplicates. That's a wild shot in the dark though.
>
> Best,
> Erick
>
> On Mon, Apr 7, 2014 at 12:26 PM, Simon  wrote:
> > Erick,
> >
> > It's indeed quite odd.  And after I trigger re-indexing all documents
> (via
> > the normal process of existing program). The duplication is gone.  It can
> > not be reproduced easily.  But it did occur occasionally and that makes
> it a
> > frustrating task to troubleshoot.
> >
> > Thanks,
> > Simon
> >
> >
> >
> > --
> > View this message in context:
> http://lucene.472066.n3.nabble.com/Duplicate-Unique-Key-tp4129651p4129701.html
> > Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: Duplicate Unique Key

2014-04-07 Thread Erick Erickson
Oh my yes! I feel a great sense of relief every time an intermittent
problem becomes reproducible... The problem is not solved, but at
least I have a good feeling that once I don't see it any more it's
_really_ gone!

One possibility is index merging, see:
https://wiki.apache.org/solr/MergingSolrIndexes. When you merge
indexes, there is no duplicate id checking performed, so you can well
have duplicates. That's a wild shot in the dark though.

Best,
Erick

On Mon, Apr 7, 2014 at 12:26 PM, Simon  wrote:
> Erick,
>
> It's indeed quite odd.  And after I trigger re-indexing all documents (via
> the normal process of existing program). The duplication is gone.  It can
> not be reproduced easily.  But it did occur occasionally and that makes it a
> frustrating task to troubleshoot.
>
> Thanks,
> Simon
>
>
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Duplicate-Unique-Key-tp4129651p4129701.html
> Sent from the Solr - User mailing list archive at Nabble.com.


Re: Duplicate Unique Key

2014-04-07 Thread Simon
Erick,

It's indeed quite odd.  After I triggered re-indexing of all documents (via
the normal process of the existing program), the duplication is gone.  It
cannot be reproduced easily, but it did occur occasionally, and that makes it a
frustrating task to troubleshoot. 

Thanks,
Simon



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Duplicate-Unique-Key-tp4129651p4129701.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Duplicate Unique Key

2014-04-07 Thread Erick Erickson
Hmmm, that's odd. I just tried it (admittedly with post.jar rather
than SolrJ) and it works just fine.

what server are you using (e.g. CloudSolrServer)? And can you create a
self-contained program that illustrates the problem?

Best,
Erick

On Mon, Apr 7, 2014 at 8:50 AM, Simon  wrote:
> Hi all,
>
> I know someone has posted similar question before.  But my case is little
> different as I don't have the schema set up issue mentioned in those posts
> but still get duplicate records.
>
> My unique key in schema is
>
>  multiValued="false" required="true"/>
>
>
> id$
>
>
>
> Search on Solr- admin UI:   id$:1
>
> I got two documents
> {
>"id$": "1",
>"_version_": 1464225014071951400,
> "_root_": 1
> },
> {
> "id$": "1",
> "_version_": 1464236728284872700,
> "_root_": 1
> }
>
> I use SolrJ api to add documents.  My understanding solr uniqueKey is like a
> database primary key. I am wondering how could I end up with two documents
> with same uniqueKey in the index.
>
> Thanks,
> Simon
>
>
>
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Duplicate-Unique-Key-tp4129651.html
> Sent from the Solr - User mailing list archive at Nabble.com.


Duplicate Unique Key

2014-04-07 Thread Simon
Hi all,

I know someone has posted a similar question before, but my case is a little
different, as I don't have the schema setup issue mentioned in those posts
but still get duplicate records.

My unique key in schema is 




id$



Search on Solr- admin UI:   id$:1

I got two documents
{
   "id$": "1",
   "_version_": 1464225014071951400,
"_root_": 1
},
{
"id$": "1",
"_version_": 1464236728284872700,
"_root_": 1
}

I use the SolrJ API to add documents.  My understanding is that the Solr uniqueKey
is like a database primary key. I am wondering how I could end up with two
documents with the same uniqueKey in the index.

Thanks,
Simon




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Duplicate-Unique-Key-tp4129651.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr Composite Unique key from existing fields in schema

2014-02-08 Thread tamanjit.bin...@yahoo.co.in
Also take a look at 
http://wiki.apache.org/solr/UniqueKey#Use_cases_which_require_a_unique_key_generated_from_data_in_the_document



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-Composite-Unique-key-from-existing-fields-in-schema-tp4116036p4116198.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr Composite Unique key from existing fields in schema

2014-02-08 Thread tamanjit.bin...@yahoo.co.in
You could combine all your composite key columns and put them in a field in
Solr, which can then be used as the unique key. For example, if you have
two columns c1 and c2, you could have a field in Solr whose value is
c1_c2, or something along those lines.
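
A minimal SolrJ sketch of that idea (the column names c1/c2, the values and the
core URL are just placeholders, not from your setup) could be:

  import java.io.IOException;

  import org.apache.solr.client.solrj.SolrServerException;
  import org.apache.solr.client.solrj.impl.HttpSolrServer;
  import org.apache.solr.common.SolrInputDocument;

  public class CompositeKeyIndexer {
    public static void main(String[] args) throws IOException, SolrServerException {
      HttpSolrServer solr = new HttpSolrServer("http://localhost:8983/solr/collection1");

      SolrInputDocument doc = new SolrInputDocument();
      String c1 = "ORD-2014";  // first half of the composite key
      String c2 = "00042";     // second half of the composite key
      doc.addField("id", c1 + "_" + c2);  // uniqueKey = c1_c2
      doc.addField("c1", c1);
      doc.addField("c2", c2);

      solr.add(doc);
      solr.commit();
    }
  }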




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-Composite-Unique-key-from-existing-fields-in-schema-tp4116036p4116195.html
Sent from the Solr - User mailing list archive at Nabble.com.


Solr Composite Unique key from existing fields in schema

2014-02-07 Thread Anurag Verma
Hi,
 I am developing a search application using Solr. I don't have a primary
key in any table; a composite key is being used in my application. How do I
implement the composite key as the unique key in this case? Please help, I am
struggling.

-- 
Thanks & Regards
Anurag Verma
Arise! Awake! And stop not till the goal is reached!


Re: Unique key error while indexing pdf files

2013-07-02 Thread Shalin Shekhar Mangar
See http://wiki.apache.org/solr/DataImportHandler#FileListEntityProcessor

"The implicit fields generated by the FileListEntityProcessor are
fileDir, file, fileAbsolutePath, fileSize, fileLastModified and these
are available for use within the entity"

On Tue, Jul 2, 2013 at 2:47 PM, archit2112  wrote:
> Yes. The absolute path is unique. How do i implement it? can you please
> explain?
>
>
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Unique-key-error-while-indexing-pdf-files-tp4074314p4074638.html
> Sent from the Solr - User mailing list archive at Nabble.com.



-- 
Regards,
Shalin Shekhar Mangar.


Re: Unique key error while indexing pdf files

2013-07-02 Thread archit2112
Yes, the absolute path is unique. How do I implement it? Can you please
explain?



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Unique-key-error-while-indexing-pdf-files-tp4074314p4074638.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Removal of unique key - Query Elevation Component

2013-07-02 Thread archit2112
Thanks! The author_s issue has been resolved. 
Why are the other fields not getting indexed?



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Removal-of-unique-key-Query-Elevation-Component-tp4074624p4074636.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Removal of unique key - Query Elevation Component

2013-07-02 Thread Shalin Shekhar Mangar
My guess is that you have a <copyField> element which copies the
author into an author_s field.
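
That is, a directive along these lines somewhere in schema.xml (the field names here
simply mirror the ones mentioned above):

  <copyField source="author" dest="author_s"/>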

On Tue, Jul 2, 2013 at 2:14 PM, archit2112  wrote:
>
> I want to index pdf files in solr 4.3.0 using the data import handler.
>
> I have done the following:
>
> My request handler -
>
>  class="org.apache.solr.handler.dataimport.DataImportHandler">
> 
>   data-config.xml
> 
>   
>
> My data-config.xml
>
> 
> 
> 
>  processor="FileListEntityProcessor"
> baseDir="C:\Users\aroraarc\Desktop\Impdo" fileName=".*pdf"
> recursive="true">
>  url="${f.fileAbsolutePath}" format="text">
> 
> 
> 
> 
> 
> 
> 
>
> Now when i tried to index the documents i got the following error
>
> org.apache.solr.common.SolrException: Document is missing mandatory
> uniqueKey field: id
>
> Because i dont want any uniquekey in my case i disabled it as follows :
>
> In solrconfig.xml i commented out -
>
> 
> pick a fieldType to analyze queries
> string
> elevate.xml
>   
>
> In schema.xml i commented out id
>
> and added
>
> 
> 
>
> and in elevate.xml i made the following changes
>
> 
>  
>   
>  
> 
>
> When i do this the indexing takes place but the indexed docs contain an
> author,s_author and id field. The document should contain author,text,title
> and id field (as defined in my data-config.xml). Please help me out. Am i
> doing anything wrong? and from where did this s_author field come?
>
> 
> arora arc
> arora arc
> 4f65332d-49d9-497a-b88b-881da618f571
>
>
>
>
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Removal-of-unique-key-Query-Elevation-Component-tp4074624.html
> Sent from the Solr - User mailing list archive at Nabble.com.



-- 
Regards,
Shalin Shekhar Mangar.


Removal of unique key - Query Elevation Component

2013-07-02 Thread archit2112

I want to index pdf files in solr 4.3.0 using the data import handler.

I have done the following:

My request handler -

  
  
  data-config.xml  
  


My data-config.xml

  
  
  
  
  



  
  
  
  

Now when i tried to index the documents i got the following error

org.apache.solr.common.SolrException: Document is missing mandatory
uniqueKey field: id

Because i dont want any uniquekey in my case i disabled it as follows :

In solrconfig.xml i commented out -


pick a fieldType to analyze queries 
string
elevate.xml
   

In schema.xml i commented out id

and added

 


and in elevate.xml i made the following changes


 
  
 
 

When i do this the indexing takes place but the indexed docs contain an
author,s_author and id field. The document should contain author,text,title
and id field (as defined in my data-config.xml). Please help me out. Am i
doing anything wrong? and from where did this s_author field come?


arora arc
arora arc
4f65332d-49d9-497a-b88b-881da618f571





--
View this message in context: 
http://lucene.472066.n3.nabble.com/Removal-of-unique-key-Query-Elevation-Component-tp4074624.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Unique key error while indexing pdf files

2013-07-02 Thread archit2112
Yes. The absolute path is unique.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Unique-key-error-while-indexing-pdf-files-tp4074314p4074620.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Unique key error while indexing pdf files

2013-07-02 Thread Shalin Shekhar Mangar
We can't tell you what the id of your own document should be. Isn't
there anything which is unique about your pdf files? How about the
file name or the absolute path?

On Tue, Jul 2, 2013 at 11:33 AM, archit2112  wrote:
> Okay. Can you please suggest a way (with an example) of assigning this unique
> key to a pdf file. Say, a unique number to each pdf file. How do i achieve
> this?
>
>
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Unique-key-error-while-indexing-pdf-files-tp4074314p4074592.html
> Sent from the Solr - User mailing list archive at Nabble.com.



-- 
Regards,
Shalin Shekhar Mangar.


Re: Unique key error while indexing pdf files

2013-07-01 Thread archit2112
Okay. Can you please suggest a way (with an example) of assigning this unique
key to a pdf file. Say, a unique number to each pdf file. How do i achieve
this?



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Unique-key-error-while-indexing-pdf-files-tp4074314p4074592.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Unique key error while indexing pdf files

2013-07-01 Thread archit2112
Can you please suggest a way (with example) of assigning this unique key to a
pdf file?



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Unique-key-error-while-indexing-pdf-files-tp4074314p4074588.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Unique key error while indexing pdf files

2013-07-01 Thread Jack Krupansky
It's really 100% up to you how you want to come up with the unique key 
values for your documents. What would you like them to be? Just use that. 
Anything (within reason) - anything goes.


But it also comes back to your data model. You absolutely must come up with 
a data model for how you expect to index and query data in Solr before you 
just start throwing random data into Solr.


1. Design your data model.
2. Produce a Solr schema from that data model.
3. Map the raw data from your data sources (e.g., PDF files) to the fields 
in your Solr schema.


That last step includes the ID/key field, but your data model will imply any 
requirements for what the ID/key should be.


To be absolutely clear, it is 100% up to you to design the ID/key for every 
document; Solr does NOT do that for you.


Even if you are just "exploring", at least come up with an "exploratory" 
data model - which includes what expectations you have about the unique 
ID/key for each document.


So, for that first PDF file, what expectation (according to your data model) 
do you have for what its ID/key should be?


-- Jack Krupansky

-Original Message- 
From: archit2112

Sent: Monday, July 01, 2013 8:30 AM
To: solr-user@lucene.apache.org
Subject: Re: Unique key error while indexing pdf files

Im new to solr. Im just trying to understand and explore various features
offered by solr and their implementations. I would be very grateful if you
could solve my problem with any example of your choice. I just want to learn
how i can index pdf documents using data import handler.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Unique-key-error-while-indexing-pdf-files-tp4074314p4074327.html
Sent from the Solr - User mailing list archive at Nabble.com. 



Re: Unique key error while indexing pdf files

2013-07-01 Thread archit2112
I'm new to Solr. I'm just trying to understand and explore the various features
offered by Solr and their implementations. I would be very grateful if you
could solve my problem with any example of your choice. I just want to learn
how I can index PDF documents using the data import handler.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Unique-key-error-while-indexing-pdf-files-tp4074314p4074327.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Unique key error while indexing pdf files

2013-07-01 Thread Jack Krupansky

It all depends on your data model - tell us more about your data model.

For example, how will users or applications query these documents and what 
will they expect to be able to do with the ID/key for the documents?


How are you expecting to identify documents in your data model?

-- Jack Krupansky

-Original Message- 
From: archit2112

Sent: Monday, July 01, 2013 7:17 AM
To: solr-user@lucene.apache.org
Subject: Unique key error while indexing pdf files

Hi

Im trying to index pdf files in solr 4.3.0 using the data import handler.

*My request handler - *


   
 data-config1.xml
   
 

*My data-config1.xml *















Now When i try and index the files i get the following error -

org.apache.solr.common.SolrException: Document is missing mandatory
uniqueKey field: id
at
org.apache.solr.update.AddUpdateCommand.getIndexedId(AddUpdateCommand.java:88)
at
org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:517)
at
org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:396)
at
org.apache.solr.update.processor.LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:100)
at org.apache.solr.handler.dataimport.SolrWriter.upload(SolrWriter.java:70)
at
org.apache.solr.handler.dataimport.DataImportHandler$1.upload(DataImportHandler.java:235)
at
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:500)
at
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:491)
at
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:404)
at
org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:319)
at
org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:227)
at
org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:422)
at
org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:487)
at
org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:468)


This problem can be solved easily in case of database indexing but i dont
know how to go about the unique key of a document. how do i define the id
field (unique key) of a pdf file. how do i solve this problem?

Thanks in advance




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Unique-key-error-while-indexing-pdf-files-tp4074314.html
Sent from the Solr - User mailing list archive at Nabble.com. 



Unique key error while indexing pdf files

2013-07-01 Thread archit2112
Hi

Im trying to index pdf files in solr 4.3.0 using the data import handler. 

*My request handler - *

 
 
  data-config1.xml 
 
   

*My data-config1.xml *

 
 
 
 
 



 
 
 
 


Now When i try and index the files i get the following error -

org.apache.solr.common.SolrException: Document is missing mandatory
uniqueKey field: id
at
org.apache.solr.update.AddUpdateCommand.getIndexedId(AddUpdateCommand.java:88)
at
org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:517)
at
org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:396)
at
org.apache.solr.update.processor.LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:100)
at 
org.apache.solr.handler.dataimport.SolrWriter.upload(SolrWriter.java:70)
at
org.apache.solr.handler.dataimport.DataImportHandler$1.upload(DataImportHandler.java:235)
at
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:500)
at
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:491)
at
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:404)
at
org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:319)
at
org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:227)
at
org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:422)
at
org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:487)
at
org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:468)


This problem can be solved easily in the case of database indexing, but I don't
know how to go about the unique key of a document. How do I define the id
field (unique key) of a PDF file? How do I solve this problem?

Thanks in advance




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Unique-key-error-while-indexing-pdf-files-tp4074314.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr Composite Unique key from existing fields in schema

2013-05-28 Thread Jack Krupansky
Great. And I did verify that the field order cannot be guaranteed by a 
single CloneFieldUpdateProcessorFactory with multiple field names - the 
underlying code iterates over the input values, checks the field selector 
for membership and then immediately adds to the output, so changing the 
input order will change the output order. Also, field names are stored in a 
HashSet anyway, which would tend to scramble their order.


-- Jack Krupansky

-Original Message- 
From: Rishi Easwaran

Sent: Tuesday, May 28, 2013 6:01 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr Composite Unique key from existing fields in schema

Thanks Jack, That fixed it and guarantees the order.

As far as I can tell SOLR cloud 4.2.1 needs a uniquekey defined in its 
schema, or I get an exception.

SolrCore Initialization Failures
* testCloud2_shard1_replica1: 
org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: 
QueryElevationComponent requires the schema to have a uniqueKeyField.


Now that I have an autogenerated composite-id, it has to become a part of my 
schema as uniquekey for SOLR cloud to work.
 multiValued="false" required="true"/>
 multiValued="false" required="true"/>
multiValued="false" required="true"/>

compositeId

Is there a way to avoid compositeId field being defined in my schema.xml, 
would like to avoid the overhead of storing this field in my index.


Thanks,

Rishi.








-Original Message-
From: Jack Krupansky 
To: solr-user 
Sent: Tue, May 28, 2013 4:33 pm
Subject: Re: Solr Composite Unique key from existing fields in schema


The TL;DR response: Try this:


 
   userid_s
   id
 
 
   docid_s
   id
 
 
   id
   --
 
 
 


That will assure that the userid gets processed before the docid.

I'll have to review the contract for CloneFieldUpdateProcessorFactory to see
what is or ain't guaranteed when there are multiple input fields - whether
this is a bug or a feature or simply undefined.

-- Jack Krupansky

-Original Message- 
From: Rishi Easwaran

Sent: Tuesday, May 28, 2013 3:54 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr Composite Unique key from existing fields in schema

I thought the same, but that doesn't seem to be the case.








-Original Message-
From: Jack Krupansky 
To: solr-user 
Sent: Tue, May 28, 2013 3:32 pm
Subject: Re: Solr Composite Unique key from existing fields in schema


The order in the ID should be purely dependent on the order of the field
names in the processor configuration:

docid_s
userid_s

-- Jack Krupansky

-Original Message- 
From: Rishi Easwaran

Sent: Tuesday, May 28, 2013 2:54 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr Composite Unique key from existing fields in schema

Jack,

No sure if this is the correct behaviour.
I set up updateRequestorPorcess chain as mentioned below, but looks like the
compositeId that is generated is based on input order.

For example:
If my input comes in as
1
12345

I get the following compositeId1-12345.

If I reverse the input

12345

1
I get the following compositeId 12345-1 .


In this case the compositeId is not unique and I am getting duplicates.

Thanks,

Rishi.



-Original Message-
From: Jack Krupansky 
To: solr-user 
Sent: Tue, May 28, 2013 12:07 pm
Subject: Re: Solr Composite Unique key from existing fields in schema


You can do this by combining the builtin update processors.

Add this to your solrconfig:


 
   docid_s
   userid_s
   id
 
 
   id
   --
 
 
 


Add documents such as:

curl
"http://localhost:8983/solr/update?commit=true&update.chain=composite-id"; \
-H 'Content-type:application/json' -d '
[{"title": "Hello World",
 "docid_s": "doc-1",
 "userid_s": "user-1",
 "comments_ss": ["Easy", "Fast"]}]'

And get results like:

"title":["Hello World"],
"docid_s":"doc-1",
"userid_s":"user-1",
"comments_ss":["Easy",
 "Fast"],
"id":"doc-1--user-1",

Add as many fields in whatever order you want using "source" in the clone
update processor, and pick your composite key field name as well. And set
the delimiter string as well in the concat update processor.

I managed to reverse the field order from what you requested (userid,
docid).

I used the standard Solr example schema, so I used dynamic fields for the
two ids, but use your own field names.

-- Jack Krupansky

-Original Message- 
From: Rishi Easwaran

Sent: Tuesday, May 28, 2013 11:12 AM
To: solr-user@lucene.apache.org
Subject: Solr Composite Unique key from existing fields in schema

Hi All,

Historically we have used a single field in our schema as a uniqueKey.

 
 
docid

Wanted to change this to a composite key something like
userid-docid.
I know I can auto generate compositekey at document insert time, using
custom code to generate a new field, but wanted to know if there was an
inbuilt SOLR mechanism of doing this. That would prevent us from creating
and storing an extra field.

Thanks,

Rishi.












Re: Solr Composite Unique key from existing fields in schema

2013-05-28 Thread Rishi Easwaran
Thanks Jack, That fixed it and guarantees the order.

As far as I can tell SOLR cloud 4.2.1 needs a uniquekey defined in its schema, 
or I get an exception.
SolrCore Initialization Failures
 * testCloud2_shard1_replica1: 
org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: 
QueryElevationComponent requires the schema to have a uniqueKeyField. 

Now that I have an autogenerated composite-id, it has to become a part of my 
schema as uniquekey for SOLR cloud to work. 
  
  
  
compositeId

Is there a way to avoid having the compositeId field defined in my schema.xml? I would
like to avoid the overhead of storing this field in my index.

Thanks,

Rishi.


 

 

 

-Original Message-
From: Jack Krupansky 
To: solr-user 
Sent: Tue, May 28, 2013 4:33 pm
Subject: Re: Solr Composite Unique key from existing fields in schema


The TL;DR response: Try this:


  
userid_s
id
  
  
docid_s
id
  
  
id
--
  
  
  


That will assure that the userid gets processed before the docid.

I'll have to review the contract for CloneFieldUpdateProcessorFactory to see 
what is or ain't guaranteed when there are multiple input fields - whether 
this is a bug or a feature or simply undefined.

-- Jack Krupansky

-Original Message- 
From: Rishi Easwaran
Sent: Tuesday, May 28, 2013 3:54 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr Composite Unique key from existing fields in schema

I thought the same, but that doesn't seem to be the case.








-Original Message-
From: Jack Krupansky 
To: solr-user 
Sent: Tue, May 28, 2013 3:32 pm
Subject: Re: Solr Composite Unique key from existing fields in schema


The order in the ID should be purely dependent on the order of the field
names in the processor configuration:

docid_s
userid_s

-- Jack Krupansky

-Original Message- 
From: Rishi Easwaran
Sent: Tuesday, May 28, 2013 2:54 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr Composite Unique key from existing fields in schema

Jack,

No sure if this is the correct behaviour.
I set up updateRequestorPorcess chain as mentioned below, but looks like the
compositeId that is generated is based on input order.

For example:
If my input comes in as
1
12345

I get the following compositeId1-12345.

If I reverse the input

12345

1
I get the following compositeId 12345-1 .


In this case the compositeId is not unique and I am getting duplicates.

Thanks,

Rishi.



-Original Message-
From: Jack Krupansky 
To: solr-user 
Sent: Tue, May 28, 2013 12:07 pm
Subject: Re: Solr Composite Unique key from existing fields in schema


You can do this by combining the builtin update processors.

Add this to your solrconfig:


  
docid_s
userid_s
id
  
  
id
--
  
  
  


Add documents such as:

curl
"http://localhost:8983/solr/update?commit=true&update.chain=composite-id"; \
-H 'Content-type:application/json' -d '
[{"title": "Hello World",
  "docid_s": "doc-1",
  "userid_s": "user-1",
  "comments_ss": ["Easy", "Fast"]}]'

And get results like:

"title":["Hello World"],
"docid_s":"doc-1",
"userid_s":"user-1",
"comments_ss":["Easy",
  "Fast"],
"id":"doc-1--user-1",

Add as many fields in whatever order you want using "source" in the clone
update processor, and pick your composite key field name as well. And set
the delimiter string as well in the concat update processor.

I managed to reverse the field order from what you requested (userid,
docid).

I used the standard Solr example schema, so I used dynamic fields for the
two ids, but use your own field names.

-- Jack Krupansky

-Original Message- 
From: Rishi Easwaran
Sent: Tuesday, May 28, 2013 11:12 AM
To: solr-user@lucene.apache.org
Subject: Solr Composite Unique key from existing fields in schema

Hi All,

Historically we have used a single field in our schema as a uniqueKey.

  
  
docid

Wanted to change this to a composite key something like
userid-docid.
I know I can auto generate compositekey at document insert time, using
custom code to generate a new field, but wanted to know if there was an
inbuilt SOLR mechanism of doing this. That would prevent us from creating
and storing an extra field.

Thanks,

Rishi.









 


Re: Solr Composite Unique key from existing fields in schema

2013-05-28 Thread Jack Krupansky

The TL;DR response: Try this:

<updateRequestProcessorChain name="composite-id">
  <processor class="solr.CloneFieldUpdateProcessorFactory">
    <str name="source">userid_s</str>
    <str name="dest">id</str>
  </processor>
  <processor class="solr.CloneFieldUpdateProcessorFactory">
    <str name="source">docid_s</str>
    <str name="dest">id</str>
  </processor>
  <processor class="solr.ConcatFieldUpdateProcessorFactory">
    <str name="fieldName">id</str>
    <str name="delimiter">--</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>


That will assure that the userid gets processed before the docid.

I'll have to review the contract for CloneFieldUpdateProcessorFactory to see 
what is or ain't guaranteed when there are multiple input fields - whether 
this is a bug or a feature or simply undefined.


-- Jack Krupansky

-Original Message- 
From: Rishi Easwaran

Sent: Tuesday, May 28, 2013 3:54 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr Composite Unique key from existing fields in schema

I thought the same, but that doesn't seem to be the case.








-Original Message-
From: Jack Krupansky 
To: solr-user 
Sent: Tue, May 28, 2013 3:32 pm
Subject: Re: Solr Composite Unique key from existing fields in schema


The order in the ID should be purely dependent on the order of the field
names in the processor configuration:

docid_s
userid_s

-- Jack Krupansky

-Original Message- 
From: Rishi Easwaran

Sent: Tuesday, May 28, 2013 2:54 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr Composite Unique key from existing fields in schema

Jack,

No sure if this is the correct behaviour.
I set up updateRequestorPorcess chain as mentioned below, but looks like the
compositeId that is generated is based on input order.

For example:
If my input comes in as
1
12345

I get the following compositeId1-12345.

If I reverse the input

12345

1
I get the following compositeId 12345-1 .


In this case the compositeId is not unique and I am getting duplicates.

Thanks,

Rishi.



-Original Message-
From: Jack Krupansky 
To: solr-user 
Sent: Tue, May 28, 2013 12:07 pm
Subject: Re: Solr Composite Unique key from existing fields in schema


You can do this by combining the builtin update processors.

Add this to your solrconfig:


 
   docid_s
   userid_s
   id
 
 
   id
   --
 
 
 


Add documents such as:

curl
"http://localhost:8983/solr/update?commit=true&update.chain=composite-id"; \
-H 'Content-type:application/json' -d '
[{"title": "Hello World",
 "docid_s": "doc-1",
 "userid_s": "user-1",
 "comments_ss": ["Easy", "Fast"]}]'

And get results like:

"title":["Hello World"],
"docid_s":"doc-1",
"userid_s":"user-1",
"comments_ss":["Easy",
 "Fast"],
"id":"doc-1--user-1",

Add as many fields in whatever order you want using "source" in the clone
update processor, and pick your composite key field name as well. And set
the delimiter string as well in the concat update processor.

I managed to reverse the field order from what you requested (userid,
docid).

I used the standard Solr example schema, so I used dynamic fields for the
two ids, but use your own field names.

-- Jack Krupansky

-Original Message- 
From: Rishi Easwaran

Sent: Tuesday, May 28, 2013 11:12 AM
To: solr-user@lucene.apache.org
Subject: Solr Composite Unique key from existing fields in schema

Hi All,

Historically we have used a single field in our schema as a uniqueKey.

 
 
docid

Wanted to change this to a composite key something like
userid-docid.
I know I can auto generate compositekey at document insert time, using
custom code to generate a new field, but wanted to know if there was an
inbuilt SOLR mechanism of doing this. That would prevent us from creating
and storing an extra field.

Thanks,

Rishi.










Re: Solr Composite Unique key from existing fields in schema

2013-05-28 Thread Rishi Easwaran
I thought the same, but that doesn't seem to be the case.


 

 

 

-Original Message-
From: Jack Krupansky 
To: solr-user 
Sent: Tue, May 28, 2013 3:32 pm
Subject: Re: Solr Composite Unique key from existing fields in schema


The order in the ID should be purely dependent on the order of the field 
names in the processor configuration:

docid_s
userid_s

-- Jack Krupansky

-Original Message- 
From: Rishi Easwaran
Sent: Tuesday, May 28, 2013 2:54 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr Composite Unique key from existing fields in schema

Jack,

No sure if this is the correct behaviour.
I set up updateRequestorPorcess chain as mentioned below, but looks like the 
compositeId that is generated is based on input order.

For example:
If my input comes in as
1
12345

I get the following compositeId1-12345.

If I reverse the input

12345

1
I get the following compositeId 12345-1 .


In this case the compositeId is not unique and I am getting duplicates.

Thanks,

Rishi.



-Original Message-
From: Jack Krupansky 
To: solr-user 
Sent: Tue, May 28, 2013 12:07 pm
Subject: Re: Solr Composite Unique key from existing fields in schema


You can do this by combining the builtin update processors.

Add this to your solrconfig:


  
docid_s
userid_s
id
  
  
id
--
  
  
  


Add documents such as:

curl
"http://localhost:8983/solr/update?commit=true&update.chain=composite-id"; \
-H 'Content-type:application/json' -d '
[{"title": "Hello World",
  "docid_s": "doc-1",
  "userid_s": "user-1",
  "comments_ss": ["Easy", "Fast"]}]'

And get results like:

"title":["Hello World"],
"docid_s":"doc-1",
"userid_s":"user-1",
"comments_ss":["Easy",
  "Fast"],
"id":"doc-1--user-1",

Add as many fields in whatever order you want using "source" in the clone
update processor, and pick your composite key field name as well. And set
the delimiter string as well in the concat update processor.

I managed to reverse the field order from what you requested (userid,
docid).

I used the standard Solr example schema, so I used dynamic fields for the
two ids, but use your own field names.

-- Jack Krupansky

-Original Message- 
From: Rishi Easwaran
Sent: Tuesday, May 28, 2013 11:12 AM
To: solr-user@lucene.apache.org
Subject: Solr Composite Unique key from existing fields in schema

Hi All,

Historically we have used a single field in our schema as a uniqueKey.

  
  
docid

Wanted to change this to a composite key something like
userid-docid.
I know I can auto generate compositekey at document insert time, using
custom code to generate a new field, but wanted to know if there was an
inbuilt SOLR mechanism of doing this. That would prevent us from creating
and storing an extra field.

Thanks,

Rishi.







 


Re: Solr Composite Unique key from existing fields in schema

2013-05-28 Thread Jack Krupansky
The order in the ID should be purely dependent on the order of the field
names in the processor configuration:

<str name="source">docid_s</str>
<str name="source">userid_s</str>

-- Jack Krupansky

-Original Message- 
From: Rishi Easwaran

Sent: Tuesday, May 28, 2013 2:54 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr Composite Unique key from existing fields in schema

Jack,

No sure if this is the correct behaviour.
I set up updateRequestorPorcess chain as mentioned below, but looks like the 
compositeId that is generated is based on input order.


For example:
If my input comes in as
1
12345

I get the following compositeId1-12345.

If I reverse the input

12345

1
I get the following compositeId 12345-1 .


In this case the compositeId is not unique and I am getting duplicates.

Thanks,

Rishi.



-Original Message-
From: Jack Krupansky 
To: solr-user 
Sent: Tue, May 28, 2013 12:07 pm
Subject: Re: Solr Composite Unique key from existing fields in schema


You can do this by combining the builtin update processors.

Add this to your solrconfig:


 
   docid_s
   userid_s
   id
 
 
   id
   --
 
 
 


Add documents such as:

curl
"http://localhost:8983/solr/update?commit=true&update.chain=composite-id"; \
-H 'Content-type:application/json' -d '
[{"title": "Hello World",
 "docid_s": "doc-1",
 "userid_s": "user-1",
 "comments_ss": ["Easy", "Fast"]}]'

And get results like:

"title":["Hello World"],
"docid_s":"doc-1",
"userid_s":"user-1",
"comments_ss":["Easy",
 "Fast"],
"id":"doc-1--user-1",

Add as many fields in whatever order you want using "source" in the clone
update processor, and pick your composite key field name as well. And set
the delimiter string as well in the concat update processor.

I managed to reverse the field order from what you requested (userid,
docid).

I used the standard Solr example schema, so I used dynamic fields for the
two ids, but use your own field names.

-- Jack Krupansky

-Original Message- 
From: Rishi Easwaran

Sent: Tuesday, May 28, 2013 11:12 AM
To: solr-user@lucene.apache.org
Subject: Solr Composite Unique key from existing fields in schema

Hi All,

Historically we have used a single field in our schema as a uniqueKey.

 
 
docid

Wanted to change this to a composite key something like
userid-docid.
I know I can auto generate compositekey at document insert time, using
custom code to generate a new field, but wanted to know if there was an
inbuilt SOLR mechanism of doing this. That would prevent us from creating
and storing an extra field.

Thanks,

Rishi.








Re: Solr Composite Unique key from existing fields in schema

2013-05-28 Thread Rishi Easwaran
Jack,

Not sure if this is the correct behaviour.
I set up the updateRequestProcessor chain as mentioned below, but it looks like the
compositeId that is generated is based on input order.

For example:
If my input comes in with the two id fields in one order (1, then 12345),
I get the compositeId 1-12345.

If I reverse the input (12345, then 1),
I get the compositeId 12345-1.
 

In this case the compositeId is not unique and I am getting duplicates.

Thanks,

Rishi.



-Original Message-
From: Jack Krupansky 
To: solr-user 
Sent: Tue, May 28, 2013 12:07 pm
Subject: Re: Solr Composite Unique key from existing fields in schema


You can do this by combining the builtin update processors.

Add this to your solrconfig:


  
docid_s
userid_s
id
  
  
id
--
  
  
  


Add documents such as:

curl 
"http://localhost:8983/solr/update?commit=true&update.chain=composite-id"; \
-H 'Content-type:application/json' -d '
[{"title": "Hello World",
  "docid_s": "doc-1",
  "userid_s": "user-1",
  "comments_ss": ["Easy", "Fast"]}]'

And get results like:

"title":["Hello World"],
"docid_s":"doc-1",
"userid_s":"user-1",
"comments_ss":["Easy",
  "Fast"],
"id":"doc-1--user-1",

Add as many fields in whatever order you want using "source" in the clone 
update processor, and pick your composite key field name as well. And set 
the delimiter string as well in the concat update processor.

I managed to reverse the field order from what you requested (userid, 
docid).

I used the standard Solr example schema, so I used dynamic fields for the 
two ids, but use your own field names.

-- Jack Krupansky

-Original Message- 
From: Rishi Easwaran
Sent: Tuesday, May 28, 2013 11:12 AM
To: solr-user@lucene.apache.org
Subject: Solr Composite Unique key from existing fields in schema

Hi All,

Historically we have used a single field in our schema as a uniqueKey.

  
  
docid

Wanted to change this to a composite key something like 
userid-docid.
I know I can auto generate compositekey at document insert time, using 
custom code to generate a new field, but wanted to know if there was an 
inbuilt SOLR mechanism of doing this. That would prevent us from creating 
and storing an extra field.

Thanks,

Rishi.





 


Re: Solr Composite Unique key from existing fields in schema

2013-05-28 Thread Rishi Easwaran
Thanks Jack, looks like that will do the trick for me. I will try it out.

 

 

 

-Original Message-
From: Jack Krupansky 
To: solr-user 
Sent: Tue, May 28, 2013 12:07 pm
Subject: Re: Solr Composite Unique key from existing fields in schema


You can do this by combining the builtin update processors.

Add this to your solrconfig:


  
docid_s
userid_s
id
  
  
id
--
  
  
  


Add documents such as:

curl 
"http://localhost:8983/solr/update?commit=true&update.chain=composite-id"; \
-H 'Content-type:application/json' -d '
[{"title": "Hello World",
  "docid_s": "doc-1",
  "userid_s": "user-1",
  "comments_ss": ["Easy", "Fast"]}]'

And get results like:

"title":["Hello World"],
"docid_s":"doc-1",
"userid_s":"user-1",
"comments_ss":["Easy",
  "Fast"],
"id":"doc-1--user-1",

Add as many fields in whatever order you want using "source" in the clone 
update processor, and pick your composite key field name as well. And set 
the delimiter string as well in the concat update processor.

I managed to reverse the field order from what you requested (userid, 
docid).

I used the standard Solr example schema, so I used dynamic fields for the 
two ids, but use your own field names.

-- Jack Krupansky

-Original Message- 
From: Rishi Easwaran
Sent: Tuesday, May 28, 2013 11:12 AM
To: solr-user@lucene.apache.org
Subject: Solr Composite Unique key from existing fields in schema

Hi All,

Historically we have used a single field in our schema as a uniqueKey.

  
  
docid

Wanted to change this to a composite key something like 
userid-docid.
I know I can auto generate compositekey at document insert time, using 
custom code to generate a new field, but wanted to know if there was an 
inbuilt SOLR mechanism of doing this. That would prevent us from creating 
and storing an extra field.

Thanks,

Rishi.





 


Re: Solr Composite Unique key from existing fields in schema

2013-05-28 Thread Jack Krupansky

You can do this by combining the builtin update processors.

Add this to your solrconfig:

<updateRequestProcessorChain name="composite-id">
  <processor class="solr.CloneFieldUpdateProcessorFactory">
    <str name="source">docid_s</str>
    <str name="source">userid_s</str>
    <str name="dest">id</str>
  </processor>
  <processor class="solr.ConcatFieldUpdateProcessorFactory">
    <str name="fieldName">id</str>
    <str name="delimiter">--</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>


Add documents such as:

curl "http://localhost:8983/solr/update?commit=true&update.chain=composite-id" \
  -H 'Content-type:application/json' -d '
  [{"title": "Hello World",
    "docid_s": "doc-1",
    "userid_s": "user-1",
    "comments_ss": ["Easy", "Fast"]}]'

And get results like:

"title":["Hello World"],
"docid_s":"doc-1",
"userid_s":"user-1",
"comments_ss":["Easy",
 "Fast"],
"id":"doc-1--user-1",

Add as many fields in whatever order you want using "source" in the clone 
update processor, and pick your composite key field name as well. And set 
the delimiter string as well in the concat update processor.


I managed to reverse the field order from what you requested (userid, 
docid).


I used the standard Solr example schema, so I used dynamic fields for the 
two ids, but use your own field names.


-- Jack Krupansky

-Original Message- 
From: Rishi Easwaran

Sent: Tuesday, May 28, 2013 11:12 AM
To: solr-user@lucene.apache.org
Subject: Solr Composite Unique key from existing fields in schema

Hi All,

Historically we have used a single field in our schema as a uniqueKey.

 multiValued="false" required="true"/>
 multiValued="false" required="true"/>

docid

Wanted to change this to a composite key something like 
userid-docid.
I know I can auto generate compositekey at document insert time, using 
custom code to generate a new field, but wanted to know if there was an 
inbuilt SOLR mechanism of doing this. That would prevent us from creating 
and storing an extra field.


Thanks,

Rishi.






Re: Solr Composite Unique key from existing fields in schema

2013-05-28 Thread Jan Høydahl
The cleanest is to do this from the outside.

Alternatively, it will perhaps work to populate your uniqueKey in a custom 
UpdateProcessor. You can try.
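
If you try that route, the custom factory would be wired into solrconfig.xml roughly
like this; the class name below is purely hypothetical:

  <updateRequestProcessorChain name="populate-key" default="true">
    <!-- Hypothetical custom factory that derives the uniqueKey from other fields. -->
    <processor class="com.example.CompositeKeyUpdateProcessorFactory"/>
    <processor class="solr.LogUpdateProcessorFactory"/>
    <processor class="solr.RunUpdateProcessorFactory"/>
  </updateRequestProcessorChain>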

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

28. mai 2013 kl. 17:12 skrev Rishi Easwaran :

> Hi All,
> 
> Historically we have used a single field in our schema as a uniqueKey.
> 
>   multiValued="false" required="true"/>
>   multiValued="false" required="true"/> 
> docid
> 
> Wanted to change this to a composite key something like 
> userid-docid.
> I know I can auto generate compositekey at document insert time, using custom 
> code to generate a new field, but wanted to know if there was an inbuilt SOLR 
> mechanism of doing this. That would prevent us from creating and storing an 
> extra field.
> 
> Thanks,
> 
> Rishi.
> 
> 
> 
> 



Solr Composite Unique key from existing fields in schema

2013-05-28 Thread Rishi Easwaran
Hi All,

Historically we have used a single field in our schema as a uniqueKey.

  
   
docid

Wanted to change this to a composite key something like 
userid-docid.
I know I can auto generate compositekey at document insert time, using custom 
code to generate a new field, but wanted to know if there was an inbuilt SOLR 
mechanism of doing this. That would prevent us from creating and storing an 
extra field.

Thanks,

Rishi.






Re: Solr unique key can't be blank

2012-09-12 Thread Ahmet Arslan
> Thank you Ahmet! In fact, I did not know that the
> updateRequestProcessorChain needed to be defined in
> solrconfig.xml and
> I had tried to define it in schema.xml. I don't have access
> to
> solrconfig.xml (I am using Websolr) but I will contact them
> about
> adding it.

Please not that you need to reference it to UpdateRequestHander that you are 
using. (this can be extracting, dataimport etc)

  


   
 uuid
  
  



Re: Solr unique key can't be blank

2012-09-12 Thread Jack Krupansky
The UniqueKey wiki was recently updated to indicate this new Solr 4.0 
requirement:


http://wiki.apache.org/solr/UniqueKey

"in Solr 4, this field must be populated via 
solr.UUIDUpdateProcessorFactory"


The changes you were given are contained on that updated wiki page.

-- Jack Krupansky

-Original Message- 
From: Dotan Cohen

Sent: Wednesday, September 12, 2012 10:43 AM
To: solr-user@lucene.apache.org
Subject: Re: Solr unique key can't be blank

On Wed, Sep 12, 2012 at 5:27 PM, Ahmet Arslan  wrote:

Hi Dotan,

Did you define the following update processor chain in solrconfig.xml ?
And did you reference it in an update handler?



  id






Thank you Ahmet! In fact, I did not know that the
updateRequestProcessorChain needed to be defined in solrconfig.xml and
I had tried to define it in schema.xml. I don't have access to
solrconfig.xml (I am using Websolr) but I will contact them about
adding it.

Thank you.

--
Dotan Cohen

http://gibberish.co.il
http://what-is-what.com 



Re: Solr unique key can't be blank

2012-09-12 Thread Dotan Cohen
On Wed, Sep 12, 2012 at 5:27 PM, Ahmet Arslan  wrote:
> Hi Dotan,
>
> Did you define the following update processor chain in solrconfig.xml ?
> And did you reference it in an update handler?
>
> 
> 
>   id
> 
> 
> 
>

Thank you Ahmet! In fact, I did not know that the
updateRequestProcessorChain needed to be defined in solrconfig.xml and
I had tried to define it in schema.xml. I don't have access to
solrconfig.xml (I am using Websolr) but I will contact them about
adding it.

Thank you.

-- 
Dotan Cohen

http://gibberish.co.il
http://what-is-what.com


Re: Solr unique key can't be blank

2012-09-12 Thread Ahmet Arslan


--- On Wed, 9/12/12, Dotan Cohen  wrote:

> From: Dotan Cohen 
> Subject: Solr unique key can't be blank
> To: solr-user@lucene.apache.org
> Date: Wednesday, September 12, 2012, 5:06 PM
> Consider this simple schema:
> 
> 
> 
>     
>          name="uuid" class="solr.UUIDField" indexed="true" />
>     
>     
>          type="uuid" indexed="true" stored="true"
> required="true"/>
>     
> 
> 
> When trying to upload it to Websolr I am getting this
> error:
> Solr unique key can't be blank
> 
> I also tried adding this element to the XML, after
> :
> id
> 
> However this did not help. What could be the issue? I The
> code is
> taken verbatim from this page:
> http://wiki.apache.org/solr/UniqueKey
> 
> Note that this is on a Solr 4 Alpha index. Thanks.

Hi Dotan,

Did you define the following update processor chain in solrconfig.xml?
And did you reference it in an update handler?

<updateRequestProcessorChain name="uuid">
  <processor class="solr.UUIDUpdateProcessorFactory">
    <str name="fieldName">id</str>
  </processor>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>



Re: Solr - Unique Key Field Should Apply on q search or fq search

2012-08-24 Thread Jack Krupansky
A query such as "q=myTextFeild:politics programme" will search for 
"programme" in the default search field, which may not have any hits. An 
explicit field name applies to only the immediately successive term or 
parenthesized sub-query.


The second and third queries work because the default operator is "OR", so 
it doesn't matter that "programme" can't be found.


Maybe you meant "q=myTextFeild:(politics programme)"

Or, actually, "q=myTextFeild:(politics AND programme)" or 
"q=myTextFeild:(+politics +programme)"


-- Jack Krupansky

-Original Message- 
From: meghana

Sent: Friday, August 24, 2012 7:54 AM
To: solr-user@lucene.apache.org
Subject: Solr - Unique Key Field Should Apply on q search or fq search

I am currently applying the unique key search in the q parameter, but sometimes
this causes an issue with the text search.

For example, if I search with the URL below, it returns 0 rows even though such
a record exists.

http://localhost:8080/solr/core0/select?q=myTextFeild:politics programme AND
myuniquekey:193834

but if i modify my search with any of below mentioned search query it works
properly.

http://localhost:8080/solr/core0/select?q=myuniquekey:193834 AND
myTextFeild:politics programme

OR

http://localhost:8080/solr/core0/select?q=myTextFeild:politics
programme&fq=myuniquekey:193834

Now I don't know which would be the better option: should I apply the
unique key in the main query (q) or in a filter query (fq)?

Please Suggest.
Thanks






--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-Unique-Key-Field-Should-Apply-on-q-search-or-fq-search-tp4003066.html
Sent from the Solr - User mailing list archive at Nabble.com. 



Re: Solr - Unique Key Field Should Apply on q search or fq search

2012-08-24 Thread Ahmet Arslan
> For. e. g. if i search with below url, then it return
> results me as 0 rows ,
> where as such record exist.
> 
> http://localhost:8080/solr/core0/select?q=myTextFeild:politics
> programme AND
> myuniquekey:193834
> 
> but if i modify my search with any of below mentioned search
> query it works
> properly.
> 
> http://localhost:8080/solr/core0/select?q=myuniquekey:193834
> AND
> myTextFeild:politics programme
> 
> OR 
> 
> http://localhost:8080/solr/core0/select?q=myTextFeild:politics
> programme&fq=myuniquekey:193834
> 
> Now i don't know which would be better option to apply ,
> should i apply
> unique key on query or in filter query


myTextFeild:politics programme is parsed as follows :

myTextFeild:politics defaultField:programme

You should use parentheses:
q=myTextFeild:(politics programme) AND myuniquekey:193834

Filter queries are cached, so if you will be re-using the same uniqueKey it is better
to use fq.



Re: unique key

2012-07-10 Thread Tomás Fernández Löbbe
No, a unique key needs to be indexed. You can delete documents by query (to
avoid duplication), but you can't query on any field that is not indexed,
so I guess you'll need it.
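
In schema.xml terms, the difference under discussion is roughly the following (field
and type names are only illustrative):

  <!-- Usable as the uniqueKey: indexed, so updates and deletes by id can find the document. -->
  <field name="id" type="string" indexed="true" stored="true" required="true"/>
  <uniqueKey>id</uniqueKey>

  <!-- Stored-only, as proposed to save space: Solr cannot look a document up
       by a value that is not indexed, so this cannot serve as the uniqueKey. -->
  <field name="id" type="string" indexed="false" stored="true" required="true"/>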

On Tue, Jul 10, 2012 at 12:23 PM, Sachin Aggarwal <
different.sac...@gmail.com> wrote:

> today i experimented some parameters with apache-solr-4.0.0-ALPHA
>
> example record :-
> id,u,u1,u2,u3,u4,u5,u6,u7,u8,u9
> df8caf0b-a0b2-4cc8-8594-e9b17026f126, ARGUE, COUDE, FLONG, JIVER, MALAR,
> PARVO, SARIS, ULANS, VIRLSA, CHATEAUBRIAND
>
>
> for 5000 index dir size is when unique key is indexed:::596.19 KB
> for 5000 index dir size is when unique key is not indexed:::399.98 KB
>
> for 5005000 index dir size is when unique key is not indexed:333.73 MB
> for 5005000 index dir size is when unique key is indexed:::520.46
> MB
>
> for 1001 index dir size is when unique key is not indexed:667.42 MB
> for 1001 index dir size is when unique key is indexed:::1.02 GB
>
> there is a significant decrease in index size but it costs me duplication
> of record if the job fails.thats too dangerous is there any way where
> we can make sure that solr maintain unique key but don't index it.
>


Re: unique key

2012-07-10 Thread Tomás Fernández Löbbe
There are some specific use cases where you can skip having a unique key.
See http://wiki.apache.org/solr/UniqueKey
However, I would test how much space you save by not having one.

On Tue, Jul 10, 2012 at 6:27 AM, Sachin Aggarwal  wrote:

> in my use case i m not deleting any doc from solr i m using batch build on
> data and use solr as filters on data data is very large raw rows are in
> billions and filtered or searched query are in millions...is there any way
> to leave unique key from indexing
>
> On Tue, Jul 10, 2012 at 3:42 PM, Tomás Fernández Löbbe <
> tomasflo...@gmail.com> wrote:
>
> > In order to support updates (which are treated as deleted + add), the
> > unique key needs to be indexed.
> >
> > Tomás
> >
> > On Tue, Jul 10, 2012 at 6:08 AM, Sachin Aggarwal <
> > different.sac...@gmail.com
> > > wrote:
> >
> > > is it possible not to index and but store the unique keyit will
> make
> > > index size small. i need the unique key to b stored so that i can read
> > the
> > > returned result from the database.
> > >
> > > --
> > >
> > > Thanks & Regards
> > >
> > > Sachin Aggarwal
> > > 7760502772
> > >
> >
>
>
>
> --
>
> Thanks & Regards
>
> Sachin Aggarwal
> 7760502772
>


Re: unique key

2012-07-10 Thread Sachin Aggarwal
In my use case I am not deleting any docs from Solr. I build the index in batches
and use Solr as a filter on the data. The data is very large: raw rows are in the
billions and filtered/searched results are in the millions. Is there any way to
leave the unique key out of the index?

On Tue, Jul 10, 2012 at 3:42 PM, Tomás Fernández Löbbe <
tomasflo...@gmail.com> wrote:

> In order to support updates (which are treated as deleted + add), the
> unique key needs to be indexed.
>
> Tomás
>
> On Tue, Jul 10, 2012 at 6:08 AM, Sachin Aggarwal <
> different.sac...@gmail.com
> > wrote:
>
> > is it possible not to index and but store the unique keyit will make
> > index size small. i need the unique key to b stored so that i can read
> the
> > returned result from the database.
> >
> > --
> >
> > Thanks & Regards
> >
> > Sachin Aggarwal
> > 7760502772
> >
>



-- 

Thanks & Regards

Sachin Aggarwal
7760502772


Re: unique key

2012-07-10 Thread Tomás Fernández Löbbe
In order to support updates (which are treated as deleted + add), the
unique key needs to be indexed.

Tomás

On Tue, Jul 10, 2012 at 6:08 AM, Sachin Aggarwal  wrote:

> is it possible not to index and but store the unique keyit will make
> index size small. i need the unique key to b stored so that i can read the
> returned result from the database.
>
> --
>
> Thanks & Regards
>
> Sachin Aggarwal
> 7760502772
>


Re: Duplicate documents being added even with unique key

2012-05-21 Thread Parmeley, Michael
Changing my field type to string for my uniquekey field solved the problem. 
Thanks to Jack and Erik for the fix!
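
In schema.xml terms, the fix amounts to something like this (assuming the rest of the
field attributes stay as they were):

  <field name="uniquekey" type="string" indexed="true" stored="true" required="true"/>
  <uniqueKey>uniquekey</uniqueKey>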

On May 18, 2012, at 5:33 PM, Jack Krupansky wrote:

> Typically the uniqueKey field is a "string" field type (your schema uses 
> "text_general"), although I don't think it is supposed to be a requirement. 
> Still, it is one thing that stands out.
> 
> Actually, you may be running into some variation of SOLR-1401:
> 
> https://issues.apache.org/jira/browse/SOLR-1401
> 
> In other words, stick with "string" and stay away from a tokenized (text) 
> key.
> 
> You could also get duplicates by merging cores or if your "add" has 
> allowDups = "true" or overwrite="false".
> 
> -- Jack Krupansky
> 
> -Original Message- 
> From: Parmeley, Michael
> Sent: Friday, May 18, 2012 5:50 PM
> To: solr-user@lucene.apache.org
> Subject: Duplicate documents being added even with unique key
> 
> I have a uniquekey set in my schema; however, I am still getting duplicated 
> documents added. Can anyone provide any insight into why this may be 
> happening?
> 
> This is in my schema.xml:
> 
> 
> uniquekey
> 
>required="true" />
> 
> On startup I get this message in catalina.out:
> 
> INFO: unique key field: uniquekey
> 
> However, you can see I get multiple documents:
> 
> 
> 
> PSR3
> 1
> Skill
> 510
> Body and Soul
> 1
> 281
> Skill510
> 
> 
> PSR3
> 1
> Skill
> 510
> Body and Soul
> 1
> 281
> Skill510
> 
> 
> PSR3
> 1
> Skill
> 510
> Body and Soul
> 1
> 281
> Skill510
> 
> 
> PSR3
> 1
> Skill
> 510
> Body and Soul
> 1
> 281
> Skill510
> 
> 
> PSR3
> 1
> Skill
> 510
> Body and Soul
> 1
> 281
> Skill510
> 
> 
> PSR3
> 1
> Skill
> 510
> Body and Soul
> 1
> 281
> Skill510
> 
> 
> PSR3
> 1
> Skill
> 510
> Body and Soul
> 1
> 281
> Skill510
> 
>  
> 


