Re: Compound Primary Keys

2019-04-24 Thread Vivekanand
Thanks a lot. I appreciate the pointers about indexing engine vs. database.
I ended up concatenating the fields to generate a key.




Re: Compound Primary Keys

2019-04-24 Thread David Hastings
Another thing to consider is just merging the two fields into the id
value:
"id": "USER_RECORD_12334",
since it's a string.
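
A minimal sketch of that merge in Python, assuming an underscore delimiter as in the example above. The helper names `make_id` and `split_id` are hypothetical, and splitting from the right only works if the original id itself contains no underscore.

```python
from typing import Tuple


def make_id(id_type: str, raw_id: str) -> str:
    """Join the record type and the original id into one string id."""
    return f"{id_type}_{raw_id}"


def split_id(compound_id: str) -> Tuple[str, str]:
    """Recover (id_type, raw_id) by splitting on the LAST underscore.

    Assumption: raw_id never contains an underscore; id_type may.
    """
    id_type, raw_id = compound_id.rsplit("_", 1)
    return id_type, raw_id


print(make_id("USER_RECORD", "12334"))   # USER_RECORD_12334
print(split_id("USER_RECORD_12334"))     # ('USER_RECORD', '12334')
```

If the original ids could contain the delimiter, a different separator or keeping the parts in their own fields would be the safer choice.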





Re: Compound Primary Keys

2019-04-24 Thread Gus Heck
Hi Vivek

Solr is not a database, nor should one try to use it as such. You'll need
to adjust your thinking some in order to make good use of Solr. In Solr
there is normally an id field and it should be unique across EVERY document
in the entire collection. Thus there's no concept of a primary key, because
there are no tables. In some situations (streaming expressions for example)
you might want to use collections like tables, creating a collection per
data type, but there's no way to define uniqueness in terms of more than
one field within a collection. If your data comes from a database with
complex keys, concatenating the values to form the single unique ID is a
possibility. If you form keys that way, you will of course also want to retain
the values as individual fields. This duplication might seem odd from a
database perspective where one often works hard to normalize data, but for
search, denormalization is very common. The focus with search engines is
usually speed of retrieval rather than data correctness. Solr should serve
as an index into some other canonical source of truth for your data, and
that source of truth should be in charge of guaranteeing data correctness.
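
To illustrate why the id must be unique across the whole collection, here is a toy Python model, not Solr itself. A dict keyed by id stands in for the index, so adding a document whose id already exists replaces the earlier one. The `ToyIndex` class, the `with_compound_id` helper, the `db_id` field name and the two colliding rows are all assumptions made for illustration.

```python
class ToyIndex:
    """Stand-in for a collection: at most one document per unique id."""

    def __init__(self):
        self.docs = {}

    def add(self, doc):
        # A later document with the same id replaces the earlier one.
        self.docs[doc["id"]] = doc


def with_compound_id(record):
    """Build the id from id_type + original id, keeping both as fields."""
    return {
        "id": f'{record["id_type"]}_{record["id"]}',
        "id_type": record["id_type"],
        "db_id": record["id"],  # original database id, retained as a field
    }


# Hypothetical rows from two tables that happen to share the id 12334.
user = {"id": "12334", "id_type": "USER_RECORD"}
owner = {"id": "12334", "id_type": "OWNER_RECORD"}

naive = ToyIndex()
naive.add(user)
naive.add(owner)            # silently replaces the user record
print(len(naive.docs))      # 1

compound = ToyIndex()
compound.add(with_compound_id(user))
compound.add(with_compound_id(owner))
print(sorted(compound.docs))  # ['OWNER_RECORD_12334', 'USER_RECORD_12334']
```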

Another alternative is to provide a field that denotes the type (table) for
the document (such as id_type in your example). In that case, all queries
looking for a specific object type as a result should add a filter (fq
parameter) to denote the "table" and you may want to store a db_id field to
correlate the data with a database if that's where it came from. When using
the field/filter strategy you tend to inflate the number of fields in the
index, with some fields being sparsely populated, and this can have some
performance implications. Furthermore, if one "table" gets updated
frequently you wind up interfering with the caching for all data due to
frequent opening of new searchers. On the plus side, such a strategy makes
it easier to query across multiple types simultaneously, so these
considerations should be balanced against your usage patterns, performance
needs, ease of management and ease of programming.
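
A sketch of the filter approach in Python, using only the standard library to build the request parameters. The helper name `type_query` and the example query `field_1:foo` are assumptions; `q` and `fq` are the usual query and filter-query parameters.

```python
from urllib.parse import urlencode


def type_query(user_query, id_type):
    """Return a query string restricted to one document type via fq."""
    params = [
        ("q", user_query),
        ("fq", f"id_type:{id_type}"),  # the filter acts as the "table" selector
    ]
    return urlencode(params)


print(type_query("field_1:foo", "USER_RECORD"))
# q=field_1%3Afoo&fq=id_type%3AUSER_RECORD
```

The resulting string would be appended to the collection's select URL. Dropping the `fq` pair gives the cross-type query mentioned above.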

Best,
Gus



-- 
http://www.the111shift.com


Re: Compound Primary Keys

2019-04-19 Thread Vivekanand
Thanks



Re: Compound Primary Keys

2019-04-19 Thread Erick Erickson
Yep. There’s no equivalent of an RDBMS’s composite key in Solr out of the box.




Re: Compound Primary Keys

2019-04-19 Thread Vivekanand
When you say "roll your own", you mean create a single field by
concatenation so that the result is unique? Like USER_RECORD_12334?



Re: Compound Primary Keys

2019-04-19 Thread Erick Erickson
Basically you have to roll your own. You could do this when you assemble the
document on the client or use an UpdateRequestProcessor. If the latter, be
very, very sure you get it in the right place, specifically _before_ the doc is
routed.

But I’d just assemble it on the client when I created the doc.

Best,
Erick




Compound Primary Keys

2019-04-19 Thread Vivekanand
Hello,

I have a use case like below.

*USE CASE*

I have a document with fields like

Id,
Id_type,
Field_1,
Field_2

Two sample messages will look like this:

{
  "id": "12334",
  "id_type": "USER_RECORD",
  "field_1": null,
  "field_2": null
}

{
  "id": "31321",
  "id_type": "OWNER_RECORD",
  "field_1": null,
  "field_2": null
}

*QUESTIONS*

I’d like to define the unique key as a compound key from fields *id* and
*id_type*.

   1. Could someone give me an example of how to do this, or point to the
   relevant section in the docs?
   2. Is this the best way to define a compound primary key? Is there a
   more efficient way?

*Regards,*

*Vivek*


Compound Primary Keys

2019-04-19 Thread Vivekanand Sahay
Hello,

I have a use case like below.

USE CASE
I have a document with fields like

Id,
Id_type,
Field_1,
Field_2

2 sample messages will look like

{
  "id": "12334",
  "id_type": "USER_RECORD",
  "field_1": null,
  "field_2": null
}


{
  "id": "31321",
  "id_type": "OWNER_RECORD",
  "field_1": null,
  "field_2": null
}


QUESTIONS

I’d like to define the unique key as a compound key from fields id and id_type

  1.  Could someone give me an example of how to do this, or point to the
relevant section in the docs?
  2.  Is this the best way to define a compound primary key? Is there a more
efficient way?

Regards,
Vivek