Re: [openstack-dev] [Solum] Regarding language pack database schema
I'm also a +1 for #2. However, as discussed on IRC, we should clearly spell out that the JSON blob should never be treated in a SQL-like manner. The moment somebody says 'I want to make that item in the JSON searchable' is the time to discuss adding it as part of the SQL schema.

On 2/13/14 4:39 PM, Clayton Coleman ccole...@redhat.com wrote:

I like option #2, simply because we should force ourselves to justify every attribute that is extracted as a queryable parameter, rather than making them queryable at the start.

- Original Message -

Hi Arati,

I would vote for Option #2 as a short-term solution. Probably later we can consider using a NoSQL DB, or MariaDB, which has a Column_JSON type to store complex types.

Thanks,
Georgy

On Thu, Feb 13, 2014 at 8:12 AM, Arati Mahimane arati.mahim...@rackspace.com wrote:

Hi All,

I have been working on defining the language pack database schema. Here is a link to my review, which is still a WIP: https://review.openstack.org/#/c/71132/3

There are a couple of different opinions on how we should design the schema. A language pack has several complex attributes, which are listed here: https://etherpad.openstack.org/p/Solum-Language-pack-json-format

We need to support search queries on language packs based on various criteria. One example could be 'find a language pack where type = 'java' and version > 1.4'. The following two options are currently being discussed for the DB schema:

Option 1: Have a separate table for each complex attribute, in order to achieve normalization. The current schema follows this approach. However, this design has certain drawbacks: it will result in a lot of complex DB queries, and each new attribute will require a code change.

Option 2: Define a predefined subset of attributes on which we would support search queries. In this case, we would define columns (separate tables in the case of complex attributes) only for this subset of attributes, and all other attributes would be part of a JSON blob. With this option, we will have to go through a schema change if we decide to support search queries on other attributes at a later stage.

I would like to know everyone's thoughts on these two approaches so that we can take a final decision and go ahead with one approach. Suggestions regarding any other approaches are welcome too!

Thanks,
Arati

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

--
Georgy Okrokvertskhov
Architect, OpenStack Platform Products, Mirantis
http://www.mirantis.com
Tel. +1 650 963 9828
Mob. +1 650 996 3284
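For reference, Option #2 can be sketched in a few lines. This is a rough illustration using SQLite with hypothetical table and column names, not the actual Solum schema: the queryable subset is promoted to real (indexed) columns, and everything else is serialized into an opaque JSON blob that the SQL layer never inspects.

```python
import json
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute("""
    CREATE TABLE language_pack (
        id INTEGER PRIMARY KEY,
        name TEXT,
        type TEXT,          -- queryable: promoted to a real column
        version TEXT,       -- queryable: promoted to a real column
        attributes TEXT     -- opaque JSON blob, never inspected by SQL
    )
""")
conn.execute("CREATE INDEX idx_lp_type ON language_pack (type)")

# Non-queryable complex attributes go into the blob as-is.
extra = {'os_platform': {'OS': 'ubuntu', 'version': '12.04'},
         'compiler_versions': ['1.6', '1.7']}
conn.execute(
    "INSERT INTO language_pack (name, type, version, attributes)"
    " VALUES (?, ?, ?, ?)",
    ('java-pack', 'java', '1.7', json.dumps(extra)))

# The search query references only the promoted columns; the blob is
# deserialized by the application after the row is fetched.
row = conn.execute(
    "SELECT name, attributes FROM language_pack"
    " WHERE type = ? AND version > ?",
    ('java', '1.4')).fetchone()
name, attributes = row[0], json.loads(row[1])
```

Adding search support for an attribute currently inside the blob would require exactly the kind of schema change (a migration promoting it to a column) that the thread describes.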
Re: [openstack-dev] [Solum] Regarding language pack database schema
I agree. Let's proceed with option #2, and submit a wishlist bug to track this as tech debt. We would like to come back to this later and add an option to use a blob store for the JSON blob content, as Georgy mentioned. These could be stored in swift, or a K/V store. It might be nice to have a thin get/set abstraction there to allow alternates to be implemented as needed.

I'm not sure exactly where we can track Paul Czarkowski's suggested restriction. We may need to just rely on reviewers to prevent this, because if we ever start introspecting the JSON blob, we will be using an SQL anti-pattern. I'm generally opposed to putting arbitrarily sized text and blob entries into a SQL database, because eventually you may run into the maximum allowable size (i.e. max_allowed_packet) and cause unexpected error conditions.

Thanks,
Adrian

On Feb 18, 2014, at 8:48 AM, Paul Czarkowski paul.czarkow...@rackspace.com wrote:

[...]
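The thin get/set abstraction could look something like the following sketch (all names are hypothetical, not an existing Solum API): callers see only two methods, so the dict-backed stand-in could later be replaced by a swift- or K/V-backed implementation without touching the rest of the code.

```python
import abc


class BlobStore(abc.ABC):
    """Opaque blob storage: callers may only get or set whole blobs,
    never query inside them."""

    @abc.abstractmethod
    def set(self, key: str, blob: str) -> None: ...

    @abc.abstractmethod
    def get(self, key: str) -> str: ...


class InMemoryBlobStore(BlobStore):
    """Dict-backed stand-in. A hypothetical SwiftBlobStore would
    implement the same two methods on top of python-swiftclient."""

    def __init__(self):
        self._blobs = {}

    def set(self, key, blob):
        self._blobs[key] = blob

    def get(self, key):
        return self._blobs[key]


store = InMemoryBlobStore()
store.set('lp-123', '{"os_platform": {"OS": "ubuntu"}}')
blob = store.get('lp-123')
```

Because the interface exposes no query surface at all, it also enforces Paul Czarkowski's restriction structurally: there is simply no way to ask the store about the blob's contents.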
Re: [openstack-dev] [Solum] Regarding language pack database schema
Maybe a crazy idea, but... what if we simply don't store the JSON blob data for M1, instead of storing it in a way we don't like long term? That way there is no need to remember to change something later, even though a bug could be created anyway. I believe the fields that would be missing/not stored in the blob are:

* Compiler version
* Language platform
* OS platform

Can we live with that for M1?

On 2/18/14 12:07 PM, Adrian Otto adrian.o...@rackspace.com wrote:

[...]
Re: [openstack-dev] [Solum] Regarding language pack database schema
That is exactly option #2, which proposes storing attributes in columns. There will be a limited set of attributes, and each of them will have its own column in a table.

Thanks,
Georgy

On Tue, Feb 18, 2014 at 10:55 AM, Paul Montgomery paul.montgom...@rackspace.com wrote:

[...]
Re: [openstack-dev] [Solum] Regarding language pack database schema
Hi Arati,

I would vote for Option #2 as a short-term solution. Probably later we can consider using a NoSQL DB, or MariaDB, which has a Column_JSON type to store complex types.

Thanks,
Georgy

On Thu, Feb 13, 2014 at 8:12 AM, Arati Mahimane arati.mahim...@rackspace.com wrote:

[...]

--
Georgy Okrokvertskhov
Architect, OpenStack Platform Products, Mirantis
http://www.mirantis.com
Tel. +1 650 963 9828
Mob. +1 650 996 3284
Re: [openstack-dev] [Solum] Regarding language pack database schema
I like option #2, simply because we should force ourselves to justify every attribute that is extracted as a queryable parameter, rather than making them queryable at the start.

- Original Message -

[...]