Re: ResultsDB 2.0 - DB migration on DEV
Estimate for the PROD migration to finish is about 24 hours from now. STG was seamless, so I'm not expecting any trouble here either.

On Wed, Jan 25, 2017 at 10:47 AM, Josef Skladanka wrote:
> STG is done (took about 15 hours); starting the archive migration for
> PROD, and I'll start figuring out a way to merge the data. Probably
> tomorrow.
>
> On Tue, Jan 24, 2017 at 5:49 PM, Josef Skladanka wrote:
>> So I started the data migration for the STG archives - it should be done
>> in about 15 hours from now (it has been running for about six hours
>> already), estimated from the number of results that have already been
>> converted.
>> If that goes well, I'll start the PROD archives migration tomorrow, and
>> start working on merging the archives with the "base".
>> If nothing goes sideways, we should have all the data in one place by the
>> end of this week.
>>
>> J.

___ qa-devel mailing list -- qa-devel@lists.fedoraproject.org
To unsubscribe send an email to qa-devel-le...@lists.fedoraproject.org
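[Editorial note: the merge step mentioned above is not shown in the thread. The sketch below illustrates one plausible approach - attach the archive database and copy over only the missing rows - using an in-memory SQLite database as a stand-in for PostgreSQL (where the same idea would use postgres_fdw or a data-only pg_dump/pg_restore). Table and column names follow the schema shown later in the thread; the sample ids are made up.]

```python
import sqlite3

# 'base' stands in for the production resultsdb database.
base = sqlite3.connect(":memory:")
base.execute("CREATE TABLE result (id INTEGER PRIMARY KEY, job_id INTEGER)")
base.executemany("INSERT INTO result VALUES (?, ?)",
                 [(7857665, 308902), (7857666, 308902)])

# A second database holding the archived (older) rows.
base.execute("ATTACH DATABASE ':memory:' AS archive")
base.execute("CREATE TABLE archive.result (id INTEGER PRIMARY KEY, job_id INTEGER)")
base.executemany("INSERT INTO archive.result VALUES (?, ?)",
                 [(7857663, 308900), (7857664, 308901)])

# Copy over only rows the base does not already have, so the merge is
# idempotent and can be safely re-run after an interruption.
base.execute("""
    INSERT INTO result (id, job_id)
    SELECT id, job_id FROM archive.result a
    WHERE NOT EXISTS (SELECT 1 FROM result r WHERE r.id = a.id)
""")
base.commit()

count = base.execute("SELECT count(*) FROM result").fetchone()[0]
print(count)
```

Because the ids were assigned from one sequence before the split, the two sets are disjoint and the merge cannot collide.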
Re: ResultsDB 2.0 - DB migration on DEV
STG is done (took about 15 hours); starting the archive migration for PROD, and I'll start figuring out a way to merge the data. Probably tomorrow.

On Tue, Jan 24, 2017 at 5:49 PM, Josef Skladanka wrote:
> So I started the data migration for the STG archives - it should be done
> in about 15 hours from now (it has been running for about six hours
> already), estimated from the number of results that have already been
> converted.
> If that goes well, I'll start the PROD archives migration tomorrow, and
> start working on merging the archives with the "base".
> If nothing goes sideways, we should have all the data in one place by the
> end of this week.
>
> J.
Re: ResultsDB 2.0 - DB migration on DEV
So I started the data migration for the STG archives - it should be done in about 15 hours from now (it has been running for about six hours already), estimated from the number of results that have already been converted.

If that goes well, I'll start the PROD archives migration tomorrow, and start working on merging the archives with the "base". If nothing goes sideways, we should have all the data in one place by the end of this week.

J.
Re: ResultsDB 2.0 - DB migration on DEV
So, as we discussed during the meeting, I have offloaded the data (for stg) older than half a year to another database. This is how I did it (it probably could have been done more efficiently, but hey, this worked, and I'm not a postgres expert...):

$ pg_dump -Fc resultsdb_stg > resultsdb_stg.dump   # dump resultsdb_stg to a file
$ createdb -T template0 resultsdb_stg_archive      # create a new empty database called resultsdb_stg_archive
$ pg_restore -d resultsdb_stg_archive resultsdb_stg.dump   # load the dump into the resultsdb_stg_archive db

$ psql resultsdb_stg_archive
=# -- Get the newest result we want to keep in the archives
=# select id, job_id from result where submit_time < '2016-06-01' order by submit_time desc limit 1;
   id    | job_id
---------+--------
 7857664 | 308901
=# -- Since jobs can contain multiple results, select the first result with the
=# -- 'next' job_id (could be done as "select id, job_id from result where
=# -- job_id = 308902 order by id limit 1;" too, but this automagically catches
=# -- a hole in the job sequence)
=# select id, job_id from result where job_id > 308901 order by id limit 1;
   id    | job_id
---------+--------
 7857665 | 308902
=# -- Delete all the result_data, results, and jobs, starting from what we got
=# -- in the previous query
=# delete from result_data where result_id >= 7857665;
=# delete from result where id >= 7857665;
=# delete from job where id >= 308902;

$ psql resultsdb_stg
=# -- Since the DBs were 'cloned' at the beginning, delete the inverse set of
=# -- data from production
=# delete from result_data where result_id < 7857665;
=# delete from result where id < 7857665;
=# delete from job where id < 308902;

On Wed, Dec 7, 2016 at 2:19 PM, Josef Skladanka wrote:
> On Mon, Dec 5, 2016 at 4:25 PM, Tim Flink wrote:
>> Is there a way we could export the results as a json file or something
>> similar? If there is (or if it could be added without too much
>> trouble), we would have multiple options:
>
> Sure, adding some kind of export should be doable.
>
>> 1. Dump the contents of the current db, do a partial offline migration
>>    and finish it during the upgrade outage by export/importing the
>>    newest data, deleting the production db and importing the offline
>>    upgraded db. If that still takes too long, create a second postgres
>>    db containing the offline upgrade, switchover during the outage and
>>    import the new results since the db was copied.
>
> I slept just two hours, so this is a bit entangled for me. So - my initial
> idea was that we:
> - dump the database
> - delete most of the results
> - do the migration on the small data set
>
> In parallel (or later on), we would:
> - create a second database (let's call it 'archive')
> - import the un-migrated dump
> - remove the data that is in the production db
> - run the lengthy migration
>
> This way we have minimal downtime, and the data are available in the
> 'archive' db.
>
> With the archive db, we could either:
> 1) dump the data and then import it into the prod db (again, no downtime)
> 2) spawn another resultsdb (archives.resultsdb?) instance that would
>    operate on top of the archives
>
> I'd rather do the second, since it also has the benefit of being able to
> offload old data to the 'archive' database (which would/could be 'slow by
> definition'), while keeping the 'active' dataset small enough that it
> could all be in memory for fast queries.
>
> What do you think? I guess we wanted to do something pretty similar, I
> just got lost a bit in what you wrote :)
>
>> 2. If the import/export process is fast enough, might be able to do
>>    instead of the inplace migration
>
> My gut feeling is that it would be pretty slow, but I have no relevant
> experience.
>
> Joza
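[Editorial note: the cutoff-selection logic from the psql session above can be sketched in a self-contained way. Below, an in-memory SQLite database stands in for PostgreSQL, with the schema reduced to the columns the queries actually touch; the sample rows and dates are made up for illustration.]

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE job (id INTEGER PRIMARY KEY);
    CREATE TABLE result (id INTEGER PRIMARY KEY, job_id INTEGER, submit_time TEXT);
    CREATE TABLE result_data (id INTEGER PRIMARY KEY, result_id INTEGER);
""")
db.executemany("INSERT INTO result VALUES (?, ?, ?)", [
    (1, 100, "2016-05-30"), (2, 100, "2016-05-31"),  # old results, job 100
    (3, 101, "2016-07-01"), (4, 102, "2016-07-02"),  # new results
])
db.executemany("INSERT INTO job VALUES (?)", [(100,), (101,), (102,)])
db.executemany("INSERT INTO result_data VALUES (?, ?)", [(i, i) for i in range(1, 5)])

# Newest result we want to keep in the archive...
keep_id, keep_job = db.execute(
    "SELECT id, job_id FROM result WHERE submit_time < '2016-06-01' "
    "ORDER BY submit_time DESC LIMIT 1").fetchone()

# ...and the first result belonging to any later job. Querying job_id > keep_job
# (rather than job_id = keep_job + 1) skips holes in the job sequence.
cut_id, cut_job = db.execute(
    "SELECT id, job_id FROM result WHERE job_id > ? ORDER BY id LIMIT 1",
    (keep_job,)).fetchone()

# The archive copy would then delete everything >= (cut_id, cut_job) and the
# production copy everything below it; here we only show where the cut lands.
print(cut_id, cut_job)
```

The key property is that the cut falls on a job boundary, so no job ends up with its results split across the two databases.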
Re: ResultsDB 2.0 - DB migration on DEV
On Mon, Dec 5, 2016 at 4:25 PM, Tim Flink wrote:
> Is there a way we could export the results as a json file or something
> similar? If there is (or if it could be added without too much
> trouble), we would have multiple options:

Sure, adding some kind of export should be doable.

> 1. Dump the contents of the current db, do a partial offline migration
>    and finish it during the upgrade outage by export/importing the
>    newest data, deleting the production db and importing the offline
>    upgraded db. If that still takes too long, create a second postgres
>    db containing the offline upgrade, switchover during the outage and
>    import the new results since the db was copied.

I slept just two hours, so this is a bit entangled for me. So - my initial idea was that we:
- dump the database
- delete most of the results
- do the migration on the small data set

In parallel (or later on), we would:
- create a second database (let's call it 'archive')
- import the un-migrated dump
- remove the data that is in the production db
- run the lengthy migration

This way we have minimal downtime, and the data are available in the 'archive' db.

With the archive db, we could either:
1) dump the data and then import it into the prod db (again, no downtime)
2) spawn another resultsdb (archives.resultsdb?) instance that would operate on top of the archives

I'd rather do the second, since it also has the benefit of being able to offload old data to the 'archive' database (which would/could be 'slow by definition'), while keeping the 'active' dataset small enough that it could all be in memory for fast queries.

What do you think? I guess we wanted to do something pretty similar, I just got lost a bit in what you wrote :)

> 2. If the import/export process is fast enough, might be able to do
>    instead of the inplace migration

My gut feeling is that it would be pretty slow, but I have no relevant experience.

Joza
Re: ResultsDB 2.0 - DB migration on DEV
On Fri, 25 Nov 2016 13:01:51 +0100 Josef Skladanka wrote:
> So, I have performed the migration on DEV - there were some problems
> with it going out of memory, so I had to tweak it a bit (please have
> a look at D1059; that is what I ended up using by hot-fixing on DEV).
>
> There still is a slight problem, though - the migration of DEV took
> about 12 hours total, which is a bit unreasonable. Most of the time
> was spent in `alembic/versions/dbfab576c81_change_schema_to_v2_0_step_2.py`,
> lines 84-93 in D1059. The code takes about 5 seconds to change 1k
> results. That would mean at least 15 hours of downtime on PROD, and
> that, I think, is unrealistic...
>
> And since I don't know how to make it faster (tips are most
> welcome), I suggest that we archive most of the data in STG/PROD
> before we go forward with the migration. I'd make a complete backup
> and delete all but the data from the last 3 months (or any other
> reasonable time span).
>
> We can then populate an "archive" database and migrate it on its own,
> should we decide it is worth it (I don't think it is).
>
> What do you think?

While it would be nice not to lose all that old data (in the sense that it wouldn't be readily available), 15 hours does seem a bit extreme.

Is there a way we could export the results as a json file or something similar? If there is (or if it could be added without too much trouble), we would have multiple options:

1. Dump the contents of the current db, do a partial offline migration
   and finish it during the upgrade outage by export/importing the
   newest data, deleting the production db and importing the offline
   upgraded db. If that still takes too long, create a second postgres
   db containing the offline upgrade, switch over during the outage and
   import the new results since the db was copied.

2. If the import/export process is fast enough, we might be able to do
   that instead of the in-place migration.

Thoughts on either of these options?
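[Editorial note: the JSON export Tim suggests is not specified further in the thread; a rough sketch might look like the following. Keyset pagination keeps memory bounded even for a multi-million-row table. SQLite stands in for PostgreSQL, and the column names are illustrative, not the real ResultsDB schema.]

```python
import json
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE result (id INTEGER PRIMARY KEY, job_id INTEGER, outcome TEXT)")
db.executemany("INSERT INTO result VALUES (?, ?, ?)",
               [(i, i // 2, "PASSED") for i in range(1, 6)])

def export_results(db, batch_size=2):
    """Yield results as JSON-ready dicts, fetching batch_size rows per query.

    Paginating on the primary key (id > last_id) avoids OFFSET scans and
    never holds more than one batch in memory.
    """
    last_id = 0
    while True:
        rows = db.execute(
            "SELECT id, job_id, outcome FROM result WHERE id > ? "
            "ORDER BY id LIMIT ?", (last_id, batch_size)).fetchall()
        if not rows:
            break
        for id_, job_id, outcome in rows:
            yield {"id": id_, "job_id": job_id, "outcome": outcome}
        last_id = rows[-1][0]

exported = list(export_results(db))
print(len(exported))
print(json.dumps(exported[0]))
```

A streaming export like this could feed either option 1 (re-import into the upgraded db) or option 2 (replace the in-place migration entirely).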
Tim
Re: ResultsDB 2.0 - DB migration on DEV
- Original Message -
> From: "Kamil Paral"
> To: "Fedora QA Development"
> Sent: Friday, November 25, 2016 2:58:03 PM
> Subject: Re: ResultsDB 2.0 - DB migration on DEV
>
> > So, I have performed the migration on DEV - there were some problems
> > with it going out of memory, so I had to tweak it a bit (please have a
> > look at D1059, that is what I ended up using by hot-fixing on DEV).
> >
> > There still is a slight problem, though - the migration of DEV took
> > about 12 hours total, which is a bit unreasonable. Most of the time was
> > spent in `alembic/versions/dbfab576c81_change_schema_to_v2_0_step_2.py`,
> > lines 84-93 in D1059. The code takes about 5 seconds to change 1k
> > results. That would mean at least 15 hours of downtime on PROD, and
> > that, I think, is unrealistic...
> >
> > And since I don't know how to make it faster (tips are most welcome), I
> > suggest that we archive most of the data in STG/PROD before we go
> > forward with the migration. I'd make a complete backup and delete all
> > but the data from the last 3 months (or any other reasonable time span).
>
> If we can't make it faster, I think this is reasonable.

+1
Re: ResultsDB 2.0 - DB migration on DEV
> So, I have performed the migration on DEV - there were some problems with
> it going out of memory, so I had to tweak it a bit (please have a look at
> D1059, that is what I ended up using by hot-fixing on DEV).
>
> There still is a slight problem, though - the migration of DEV took about
> 12 hours total, which is a bit unreasonable. Most of the time was spent in
> `alembic/versions/dbfab576c81_change_schema_to_v2_0_step_2.py`, lines
> 84-93 in D1059. The code takes about 5 seconds to change 1k results. That
> would mean at least 15 hours of downtime on PROD, and that, I think, is
> unrealistic...
>
> And since I don't know how to make it faster (tips are most welcome), I
> suggest that we archive most of the data in STG/PROD before we go forward
> with the migration. I'd make a complete backup and delete all but the
> data from the last 3 months (or any other reasonable time span).

If we can't make it faster, I think this is reasonable.
ResultsDB 2.0 - DB migration on DEV
So, I have performed the migration on DEV - there were some problems with it going out of memory, so I had to tweak it a bit (please have a look at D1059; that is what I ended up using by hot-fixing on DEV).

There still is a slight problem, though - the migration of DEV took about 12 hours total, which is a bit unreasonable. Most of the time was spent in `alembic/versions/dbfab576c81_change_schema_to_v2_0_step_2.py`, lines 84-93 in D1059. The code takes about 5 seconds to change 1k results. That would mean at least 15 hours of downtime on PROD, and that, I think, is unrealistic...

And since I don't know how to make it faster (tips are most welcome), I suggest that we archive most of the data in STG/PROD before we go forward with the migration. I'd make a complete backup and delete all but the data from the last 3 months (or any other reasonable time span).

We can then populate an "archive" database and migrate it on its own, should we decide it is worth it (I don't think it is).

What do you think?

J.
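[Editorial note: the actual code on lines 84-93 of D1059 is not shown in this thread, so the following is only a sketch of the usual fix for row-at-a-time migration loops of the kind described above (about 5 seconds per 1k results): replace per-row statements with one set-based statement, or at least large executemany batches. SQLite stands in for PostgreSQL, and the 'outcome' rewrite is a made-up stand-in for the real schema change.]

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE result (id INTEGER PRIMARY KEY, outcome TEXT)")
db.executemany("INSERT INTO result VALUES (?, ?)",
               [(i, "CRASHED") for i in range(10_000)])

# Slow pattern (roughly what a 5s-per-1k-results loop looks like):
# one statement per row, each with its own parse/plan/round-trip overhead.
#
# for (rid,) in db.execute("SELECT id FROM result").fetchall():
#     db.execute("UPDATE result SET outcome = 'ERROR' WHERE id = ?", (rid,))

# Fast pattern: a single set-based statement over the whole table, letting
# the database do the iteration internally.
db.execute("UPDATE result SET outcome = 'ERROR' WHERE outcome = 'CRASHED'")
db.commit()

changed = db.execute(
    "SELECT count(*) FROM result WHERE outcome = 'ERROR'").fetchone()[0]
print(changed)
```

When the per-row transformation cannot be expressed in SQL, batching thousands of precomputed values through executemany (or alembic's bulk operations) is usually the next best thing.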