Re: [openstack-dev] [glance] Why no DB index on sort parameters

2015-04-27 Thread Rushi Agrawal
Now that raises a question: do we really need sorting based on arbitrary
keys in our API (e.g. listing images, volumes, instances)? If we have this
feature in our API, we're bound to run into problems by creating or not
creating indexes, at large volumes -- hurts our motive to be
easily-implementable for clouds of all sizes.

-Rushi

On 23 April 2015 at 20:40, Nikhil Komawar nikhil.koma...@rackspace.com
wrote:

 Messing with indices is not a good idea to do iteratively.  Indexing large
 data sets is a really expensive operation and should be done carefully and
 consistently. Changing around indices is only going to make things unstable.

 Thanks,
 -Nikhil

 
 From: Flavio Percoco fla...@redhat.com
 Sent: Thursday, April 23, 2015 7:52 AM
 To: OpenStack Development Mailing List (not for usage questions)
 Subject: Re: [openstack-dev] [glance] Why no DB index on sort parameters

 On 21/04/15 14:55 +, Nikhil Komawar wrote:
 Rally is great overall however, we need good EXPLAIN examples on real
 world data. Smaller deployments might benefit from a simple sample
 performance analysis however, larger data sets can have impacts on areas
 that you never expect.
 
 A spec means that we document the indices proposed in the code base,
 based on all of the use cases. The way I look at it, a patch is needed
 anyways and it (rally gate job) would get attention from reviewers when the
 patch is proposed.

 Yes, I believe we need both. However, I'd probably just start with
 something smaller and see how it behaves before going with big data
 sets.

 I'm not saying we don't need tests with proper data sets, I'm saying
 that I'd probably start with smaller ones. As Mike already mentioned
 in his email, there's an impact in writes and we can see that from
 Rally tests, AFAICT.

 The spec can come later, IMHO.

 Cheers,
 Flavio

 
 
 From: Flavio Percoco fla...@redhat.com
 Sent: Tuesday, April 21, 2015 10:48 AM
 To: OpenStack Development Mailing List (not for usage questions)
 Subject: Re: [openstack-dev] [glance] Why no DB index on sort parameters
 
 On 21/04/15 14:39 +, Nikhil Komawar wrote:
 This is a good idea. We recently removed a unique constraint that may
 result
 into some queries being very slow especially those that involve name
 property. I would recommend sketching out a spec that identifies
 potential full
 table scans especially for queries that join over image_properties table.
 
 
 We should discuss there what other use cases look like rather than
 smaller
 feedback on the ML.
 
 More thatn a spec, I'd be interested in seeing the patch with the
 change up and the results reported in Rally.
 
 I guess we'll need a spec anyway, although I'd probably be ok with a
 good bug report here.
 
 /me *shrugs*
 Flavio
 
 
 
 Thanks,
 -Nikhil

 ━━━
 From: Mike Bayer mba...@redhat.com
 Sent: Tuesday, April 21, 2015 9:45 AM
 To: openstack-dev@lists.openstack.org
 Subject: Re: [openstack-dev] [glance] Why no DB index on sort parameters
 
 
 
 On 4/21/15 2:47 AM, Ajaya Agrawal wrote:
 
 Hi All,
 
 I see that glance supports arbitrary sort parameters and the default
 is
 created_at while listing images. Is there any reason why we don't
 have
 index over these fields? If we have an index over these fields then
 we
 would avoid a full table scan to do sorting. IMO at least the
 created_at
 field should have an index on it.
 
 just keep in mind that more indexes will place a performance penalty on
 INSERT
 statements, particularly at larger volumes.  I have no idea if that is
 important here but something to keep in mind.
 
 
 
 
 
 
 Cheers,
 Ajaya
 
 
 
 
 __
 OpenStack Development Mailing List (not for usage questions)
 Unsubscribe:
 openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
 
 
 

 __
 OpenStack Development Mailing List (not for usage questions)
 Unsubscribe:
 openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
 
 
 --
 @flaper87
 Flavio Percoco
 
 __
 OpenStack Development Mailing List (not for usage questions)
 Unsubscribe:
 openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

 --
 @flaper87
 Flavio Percoco

 __
 OpenStack Development Mailing List (not for usage questions)
 Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
 http://lists.openstack.org/cgi-bin/mailman/listinfo

Re: [openstack-dev] [glance] Why no DB index on sort parameters

2015-04-27 Thread Jay Pipes
At the very least, an index on the default sort column (created_at) 
would be appropriate, IMO.


Best,
-jay

On 04/27/2015 01:42 PM, Rushi Agrawal wrote:

Now that raises a question: do we really need sorting based on arbitrary
keys in our API (e.g. listing images, volumes, instances)? If we have
this feature in our API, we're bound to run into problems by creating or
not creating indexes, at large volumes -- hurts our motive to be
easily-implementable for clouds of all sizes.

-Rushi

On 23 April 2015 at 20:40, Nikhil Komawar nikhil.koma...@rackspace.com
mailto:nikhil.koma...@rackspace.com wrote:

Messing with indices is not a good idea to do iteratively.  Indexing
large data sets is a really expensive operation and should be done
carefully and consistently. Changing around indices is only going to
make things unstable.

Thanks,
-Nikhil


From: Flavio Percoco fla...@redhat.com mailto:fla...@redhat.com
Sent: Thursday, April 23, 2015 7:52 AM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [glance] Why no DB index on sort parameters

On 21/04/15 14:55 +, Nikhil Komawar wrote:
 Rally is great overall however, we need good EXPLAIN examples on
real world data. Smaller deployments might benefit from a simple
sample performance analysis however, larger data sets can have
impacts on areas that you never expect.
 
 A spec means that we document the indices proposed in the code
base, based on all of the use cases. The way I look at it, a patch
is needed anyways and it (rally gate job) would get attention from
reviewers when the patch is proposed.

Yes, I believe we need both. However, I'd probably just start with
something smaller and see how it behaves before going with big data
sets.

I'm not saying we don't need tests with proper data sets, I'm saying
that I'd probably start with smaller ones. As Mike already mentioned
in his email, there's an impact in writes and we can see that from
Rally tests, AFAICT.

The spec can come later, IMHO.

Cheers,
Flavio

 
 
 From: Flavio Percoco fla...@redhat.com mailto:fla...@redhat.com
 Sent: Tuesday, April 21, 2015 10:48 AM
 To: OpenStack Development Mailing List (not for usage questions)
 Subject: Re: [openstack-dev] [glance] Why no DB index on sort
parameters
 
 On 21/04/15 14:39 +, Nikhil Komawar wrote:
 This is a good idea. We recently removed a unique constraint that
may result
 into some queries being very slow especially those that involve
name
 property. I would recommend sketching out a spec that identifies
potential full
 table scans especially for queries that join over
image_properties table.
 
 
 We should discuss there what other use cases look like rather
than smaller
 feedback on the ML.
 
 More thatn a spec, I'd be interested in seeing the patch with the
 change up and the results reported in Rally.
 
 I guess we'll need a spec anyway, although I'd probably be ok with a
 good bug report here.
 
 /me *shrugs*
 Flavio
 
 
 
 Thanks,
 -Nikhil
 
━━━
 From: Mike Bayer mba...@redhat.com mailto:mba...@redhat.com
 Sent: Tuesday, April 21, 2015 9:45 AM
 To: openstack-dev@lists.openstack.org
mailto:openstack-dev@lists.openstack.org
 Subject: Re: [openstack-dev] [glance] Why no DB index on sort
parameters
 
 
 
 On 4/21/15 2:47 AM, Ajaya Agrawal wrote:
 
 Hi All,
 
 I see that glance supports arbitrary sort parameters and the
default is
 created_at while listing images. Is there any reason why we
don't have
 index over these fields? If we have an index over these
fields then we
 would avoid a full table scan to do sorting. IMO at least the
created_at
 field should have an index on it.
 
 just keep in mind that more indexes will place a performance
penalty on INSERT
 statements, particularly at larger volumes.  I have no idea if
that is
 important here but something to keep in mind.
 
 
 
 
 
 
 Cheers,
 Ajaya
 
 
 
 
__
 OpenStack Development Mailing List (not for usage questions)
 Unsubscribe:
openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
  http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
 
 
 
 
__
 OpenStack

Re: [openstack-dev] [glance] Why no DB index on sort parameters

2015-04-23 Thread Nikhil Komawar
Messing with indices is not a good idea to do iteratively.  Indexing large data 
sets is a really expensive operation and should be done carefully and 
consistently. Changing around indices is only going to make things unstable.

Thanks,
-Nikhil


From: Flavio Percoco fla...@redhat.com
Sent: Thursday, April 23, 2015 7:52 AM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [glance] Why no DB index on sort parameters

On 21/04/15 14:55 +, Nikhil Komawar wrote:
Rally is great overall however, we need good EXPLAIN examples on real world 
data. Smaller deployments might benefit from a simple sample performance 
analysis however, larger data sets can have impacts on areas that you never 
expect.

A spec means that we document the indices proposed in the code base, based on 
all of the use cases. The way I look at it, a patch is needed anyways and it 
(rally gate job) would get attention from reviewers when the patch is proposed.

Yes, I believe we need both. However, I'd probably just start with
something smaller and see how it behaves before going with big data
sets.

I'm not saying we don't need tests with proper data sets, I'm saying
that I'd probably start with smaller ones. As Mike already mentioned
in his email, there's an impact in writes and we can see that from
Rally tests, AFAICT.

The spec can come later, IMHO.

Cheers,
Flavio



From: Flavio Percoco fla...@redhat.com
Sent: Tuesday, April 21, 2015 10:48 AM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [glance] Why no DB index on sort parameters

On 21/04/15 14:39 +, Nikhil Komawar wrote:
This is a good idea. We recently removed a unique constraint that may result
into some queries being very slow especially those that involve name
property. I would recommend sketching out a spec that identifies potential 
full
table scans especially for queries that join over image_properties table.


We should discuss there what other use cases look like rather than smaller
feedback on the ML.

More thatn a spec, I'd be interested in seeing the patch with the
change up and the results reported in Rally.

I guess we'll need a spec anyway, although I'd probably be ok with a
good bug report here.

/me *shrugs*
Flavio



Thanks,
-Nikhil
━━━
From: Mike Bayer mba...@redhat.com
Sent: Tuesday, April 21, 2015 9:45 AM
To: openstack-dev@lists.openstack.org
Subject: Re: [openstack-dev] [glance] Why no DB index on sort parameters



On 4/21/15 2:47 AM, Ajaya Agrawal wrote:

Hi All,

I see that glance supports arbitrary sort parameters and the default is
created_at while listing images. Is there any reason why we don't have
index over these fields? If we have an index over these fields then we
would avoid a full table scan to do sorting. IMO at least the created_at
field should have an index on it.

just keep in mind that more indexes will place a performance penalty on INSERT
statements, particularly at larger volumes.  I have no idea if that is
important here but something to keep in mind.






Cheers,
Ajaya



__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


--
@flaper87
Flavio Percoco

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

--
@flaper87
Flavio Percoco

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [glance] Why no DB index on sort parameters

2015-04-23 Thread Flavio Percoco

On 21/04/15 14:55 +, Nikhil Komawar wrote:

Rally is great overall however, we need good EXPLAIN examples on real world 
data. Smaller deployments might benefit from a simple sample performance 
analysis however, larger data sets can have impacts on areas that you never 
expect.

A spec means that we document the indices proposed in the code base, based on 
all of the use cases. The way I look at it, a patch is needed anyways and it 
(rally gate job) would get attention from reviewers when the patch is proposed.


Yes, I believe we need both. However, I'd probably just start with
something smaller and see how it behaves before going with big data
sets.

I'm not saying we don't need tests with proper data sets, I'm saying
that I'd probably start with smaller ones. As Mike already mentioned
in his email, there's an impact in writes and we can see that from
Rally tests, AFAICT.

The spec can come later, IMHO.

Cheers,
Flavio




From: Flavio Percoco fla...@redhat.com
Sent: Tuesday, April 21, 2015 10:48 AM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [glance] Why no DB index on sort parameters

On 21/04/15 14:39 +, Nikhil Komawar wrote:

This is a good idea. We recently removed a unique constraint that may result
into some queries being very slow especially those that involve name
property. I would recommend sketching out a spec that identifies potential full
table scans especially for queries that join over image_properties table.


We should discuss there what other use cases look like rather than smaller
feedback on the ML.


More thatn a spec, I'd be interested in seeing the patch with the
change up and the results reported in Rally.

I guess we'll need a spec anyway, although I'd probably be ok with a
good bug report here.

/me *shrugs*
Flavio




Thanks,
-Nikhil
━━━
From: Mike Bayer mba...@redhat.com
Sent: Tuesday, April 21, 2015 9:45 AM
To: openstack-dev@lists.openstack.org
Subject: Re: [openstack-dev] [glance] Why no DB index on sort parameters



On 4/21/15 2:47 AM, Ajaya Agrawal wrote:

   Hi All,

   I see that glance supports arbitrary sort parameters and the default is
   created_at while listing images. Is there any reason why we don't have
   index over these fields? If we have an index over these fields then we
   would avoid a full table scan to do sorting. IMO at least the created_at
   field should have an index on it.

just keep in mind that more indexes will place a performance penalty on INSERT
statements, particularly at larger volumes.  I have no idea if that is
important here but something to keep in mind.






   Cheers,
   Ajaya



   __
   OpenStack Development Mailing List (not for usage questions)
   Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
   http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev





__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



--
@flaper87
Flavio Percoco

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


--
@flaper87
Flavio Percoco


pgpL5YJyWI2GL.pgp
Description: PGP signature
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [glance] Why no DB index on sort parameters

2015-04-21 Thread Ajaya Agrawal
Hi All,

I see that glance supports arbitrary sort parameters and the default is
created_at while listing images. Is there any reason why we don't have
index over these fields? If we have an index over these fields then we
would avoid a full table scan to do sorting. IMO at least the created_at
field should have an index on it.

Cheers,
Ajaya
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [glance] Why no DB index on sort parameters

2015-04-21 Thread Flavio Percoco

On 21/04/15 12:17 +0530, Ajaya Agrawal wrote:

Hi All,

I see that glance supports arbitrary sort parameters and the default is
created_at while listing images. Is there any reason why we don't have index
over these fields? If we have an index over these fields then we would avoid a
full table scan to do sorting. IMO at least the created_at field should have an
index on it.


I don't think there's an actual reason, OTOH. I'd recommend proposing
a patch and wait for the reviews there. :)

Flavio


--
@flaper87
Flavio Percoco


pgp3yiNZiCDnJ.pgp
Description: PGP signature
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [glance] Why no DB index on sort parameters

2015-04-21 Thread Nikhil Komawar
This is a good idea. We recently removed a unique constraint that may result 
into some queries being very slow especially those that involve name 
property. I would recommend sketching out a spec that identifies potential full 
table scans especially for queries that join over image_properties table.


We should discuss there what other use cases look like rather than smaller 
feedback on the ML.


Thanks,
-Nikhil

From: Mike Bayer mba...@redhat.com
Sent: Tuesday, April 21, 2015 9:45 AM
To: openstack-dev@lists.openstack.org
Subject: Re: [openstack-dev] [glance] Why no DB index on sort parameters



On 4/21/15 2:47 AM, Ajaya Agrawal wrote:
Hi All,

I see that glance supports arbitrary sort parameters and the default is 
created_at while listing images. Is there any reason why we don't have index 
over these fields? If we have an index over these fields then we would avoid a 
full table scan to do sorting. IMO at least the created_at field should have an 
index on it.
just keep in mind that more indexes will place a performance penalty on INSERT 
statements, particularly at larger volumes.  I have no idea if that is 
important here but something to keep in mind.





Cheers,
Ajaya



__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: 
openstack-dev-requ...@lists.openstack.org?subject:unsubscribemailto:openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [glance] Why no DB index on sort parameters

2015-04-21 Thread Flavio Percoco

On 21/04/15 14:39 +, Nikhil Komawar wrote:

This is a good idea. We recently removed a unique constraint that may result
into some queries being very slow especially those that involve name
property. I would recommend sketching out a spec that identifies potential full
table scans especially for queries that join over image_properties table.


We should discuss there what other use cases look like rather than smaller
feedback on the ML.


More thatn a spec, I'd be interested in seeing the patch with the
change up and the results reported in Rally.

I guess we'll need a spec anyway, although I'd probably be ok with a
good bug report here.

/me *shrugs*
Flavio




Thanks,
-Nikhil
━━━
From: Mike Bayer mba...@redhat.com
Sent: Tuesday, April 21, 2015 9:45 AM
To: openstack-dev@lists.openstack.org
Subject: Re: [openstack-dev] [glance] Why no DB index on sort parameters



On 4/21/15 2:47 AM, Ajaya Agrawal wrote:

   Hi All,

   I see that glance supports arbitrary sort parameters and the default is
   created_at while listing images. Is there any reason why we don't have
   index over these fields? If we have an index over these fields then we
   would avoid a full table scan to do sorting. IMO at least the created_at
   field should have an index on it.

just keep in mind that more indexes will place a performance penalty on INSERT
statements, particularly at larger volumes.  I have no idea if that is
important here but something to keep in mind.






   Cheers,
   Ajaya

  


   __
   OpenStack Development Mailing List (not for usage questions)
   Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
   http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev





__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



--
@flaper87
Flavio Percoco


pgpwzvNpPX67O.pgp
Description: PGP signature
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [glance] Why no DB index on sort parameters

2015-04-21 Thread Mike Bayer



On 4/21/15 2:47 AM, Ajaya Agrawal wrote:

Hi All,

I see that glance supports arbitrary sort parameters and the default 
is created_at while listing images. Is there any reason why we don't 
have index over these fields? If we have an index over these fields 
then we would avoid a full table scan to do sorting. IMO at least the 
created_at field should have an index on it.
just keep in mind that more indexes will place a performance penalty on 
INSERT statements, particularly at larger volumes.  I have no idea if 
that is important here but something to keep in mind.







Cheers,
Ajaya


__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev