Re: Compound Primary Keys
Thanks a lot. I appreciate the pointers about an indexing engine vs. a database. I ended up concatenating the fields to generate a key.

On Wed, Apr 24, 2019 at 11:39 AM David Hastings <hastings.recurs...@gmail.com> wrote:
> Another thing to consider is to just merge the two fields into the id value:
> "id": "USER_RECORD_12334",
> since it's a string.
Re: Compound Primary Keys
Another thing to consider is to just merge the two fields into the id value:
"id": "USER_RECORD_12334",
since it's a string.

On Wed, Apr 24, 2019 at 2:35 PM Gus Heck wrote:
> Solr is not a database, nor should one try to use it as such. [...]
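For illustration, the merged id suggested above could be built and taken apart again roughly as follows. This is only a sketch: the helper names `make_id` and `split_id` and the `_` separator are assumptions, not anything Solr provides. Because type names such as USER_RECORD contain the separator themselves, the split is done from the right, which in turn assumes the record id never contains the separator.

```python
from typing import Tuple


def make_id(id_type: str, record_id: str, sep: str = "_") -> str:
    """Merge the record type and the source id into one string key."""
    return f"{id_type}{sep}{record_id}"


def split_id(merged: str, sep: str = "_") -> Tuple[str, str]:
    """Recover (id_type, record_id) by splitting at the LAST separator.

    Assumption: the record id itself never contains the separator.
    """
    id_type, record_id = merged.rsplit(sep, 1)
    return id_type, record_id


if __name__ == "__main__":
    key = make_id("USER_RECORD", "12334")
    print(key)
    print(split_id(key))
```

If the source ids could ever contain an underscore, a separator that cannot occur in either part would be the safer choice.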
Re: Compound Primary Keys
Hi Vivek,

Solr is not a database, nor should one try to use it as such. You'll need to adjust your thinking some in order to make good use of Solr. In Solr there is normally an id field, and it should be unique across EVERY document in the entire collection. Thus there's no concept of a primary key, because there are no tables. In some situations (streaming expressions, for example) you might want to use collections like tables, creating a collection per data type, but there's no way to define uniqueness in terms of more than one field within a collection.

If your data comes from a database with complex keys, concatenating the values to form the single unique ID is a possibility. If you form keys that way, of course, you also want to retain the values as individual fields. This duplication might seem odd from a database perspective, where one often works hard to normalize data, but for search, denormalization is very common. The focus with search engines is usually speed of retrieval rather than data correctness. Solr should serve as an index into some other canonical source of truth for your data, and that source of truth should be in charge of guaranteeing data correctness.

Another alternative is to provide a field that denotes the type (table) for the document (such as id_type in your example). In that case, all queries looking for a specific object type as a result should add a filter (fq parameter) to denote the "table", and you may want to store a db_id field to correlate the data with a database if that's where it came from. When using the field/filter strategy, you tend to inflate the number of fields in the index, with some fields being sparsely populated, and this can have some performance implications. Furthermore, if one "table" gets updated frequently, you wind up interfering with the caching for all data due to frequent opening of new searchers. On the plus side, such a strategy makes it easier to query across multiple types simultaneously, so these considerations should be balanced against your usage patterns, performance needs, ease of management and ease of programming.

Best,
Gus

On Fri, Apr 19, 2019 at 2:10 PM Vivekanand Sahay wrote:
> I’d like to define the unique key as a compound key from fields id and id_type [...]

--
http://www.the111shift.com
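As a rough illustration of the field/filter strategy described above, the request parameters for a type-restricted search might be assembled like this. The function name `typed_query_params` is hypothetical; `q` and `fq` are the standard Solr query and filter-query parameters, and the `id_type` field comes from the example documents in this thread. The sketch assumes type values are simple tokens and only escapes backslashes and double quotes.

```python
from urllib.parse import urlencode


def typed_query_params(user_query: str, id_type: str) -> dict:
    """Build Solr request parameters that restrict results to one "table".

    The main query goes in q; the type restriction goes in fq so it acts
    as a filter rather than as part of the main query.
    """
    # Minimal escaping so the value can sit inside a quoted phrase.
    safe_type = id_type.replace("\\", "\\\\").replace('"', '\\"')
    return {
        "q": user_query,
        "fq": f'id_type:"{safe_type}"',
    }


if __name__ == "__main__":
    params = typed_query_params("field_1:example", "USER_RECORD")
    # The encoded string can be appended to a collection's select URL.
    print(urlencode(params))
```

Leaving the fq off entirely gives the cross-type query mentioned as the upside of this approach.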
Re: Compound Primary Keys
Thanks

On Fri, Apr 19, 2019 at 7:58 PM Erick Erickson wrote:
> Yep. There’s no equivalent of an RDBMS’s composite key in Solr out of the box.
Re: Compound Primary Keys
Yep. There’s no equivalent of an RDBMS’s composite key in Solr out of the box.

On Apr 19, 2019, at 4:28 PM, Vivekanand wrote:
> When you say "roll your own", you mean create a single field by concatenation so that the result is unique? Like USER_RECORD_12334?
Re: Compound Primary Keys
When you say "roll your own", you mean create a single field by concatenation so that the result is unique? Like USER_RECORD_12334?

On Friday, April 19, 2019, Erick Erickson wrote:
> Basically you have to roll your own. You could do this when you assemble the document on the client or use an UpdateRequestProcessor. If the latter, be very, very sure you get it in the right place, specifically _before_ the doc is routed.
>
> But I’d just assemble it on the client when I created the doc.
Re: Compound Primary Keys
Basically you have to roll your own. You could do this when you assemble the document on the client or use an UpdateRequestProcessor. If the latter, be very, very sure you get it in the right place, specifically _before_ the doc is routed.

But I’d just assemble it on the client when I created the doc.

Best,
Erick

On Apr 19, 2019, at 10:40 AM, Vivekanand wrote:
> I’d like to define the unique key as a compound key from fields *id* and *id_type* [...]
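A minimal sketch of the client-side assembly described above, using the two sample records from the original question. The function name `to_solr_doc` is hypothetical, and keeping the source id in a field called `db_id` is an assumption borrowed from a field name mentioned elsewhere in this thread. The sketch only builds the documents and prints them as JSON; sending them to Solr is left out.

```python
import json


def to_solr_doc(record: dict) -> dict:
    """Return a copy of the record with a composite unique id.

    The original id is kept in db_id (assumed field name) and id_type is
    retained, so both parts stay available as individual fields.
    """
    doc = dict(record)  # shallow copy; the caller's record is not modified
    doc["db_id"] = record["id"]
    doc["id"] = f'{record["id_type"]}_{record["id"]}'
    return doc


if __name__ == "__main__":
    records = [
        {"id": "12334", "id_type": "USER_RECORD", "field_1": None, "field_2": None},
        {"id": "31321", "id_type": "OWNER_RECORD", "field_1": None, "field_2": None},
    ]
    docs = [to_solr_doc(r) for r in records]
    print(json.dumps(docs, indent=2))
```

Building the key before the document is sent means the id is already final by the time Solr routes the document, which sidesteps the processor-ordering concern.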
Compound Primary Keys
Hello,

I have a use case like below.

*USE CASE*

I have a document with fields like

Id,
Id_type,
Field_1,
Field_2

2 sample messages will look like

{
  "id": "12334",
  "id_type": "USER_RECORD",
  "field_1": null,
  "field_2": null
}

{
  "id": "31321",
  "id_type": "OWNER_RECORD",
  "field_1": null,
  "field_2": null
}

*QUESTIONS*

I’d like to define the unique key as a compound key from fields *id* and *id_type*.

1. Could someone give me an example of how to do this? Or point to the relevant section in the docs?
2. Is this the best way to define a compound primary key? Is there a more efficient way?

*Regards,*
*Vivek*