Re: [rt-users] rt-serializer and rt-importer
On Tue, 2014-03-11 at 08:51 -0700, Tim Gustafson wrote:
> And one last follow-up about this: here are my import statistics. The
> importer used up to about 800MB of RAM, so better than the serializer,
> but still pretty high.

Both of those (6G for serializing, and 800M for importing) are definitely higher than I'd expect. If you're on 4.2.2, then you have the fixes to explicitly turn off the caching layer.

If you're interested in poking further, doing some analysis of the in-memory objects to determine what's leaking would be helpful. Instrumenting the main loop of either the serializer or the importer with Devel::Gladiator::arena_ref_counts to look for what sorts of objects are leaking might be instructive.

> Also, 108 hours seems like a long time to import records that took
> only 4 hours to export, and this is on a pretty high-end MySQL server.
> Is there any way to speed things up a bit?

Yeesh, that's pretty bad. I'd try running under Devel::NYTProf to see where the hotspots are -- I don't have any off-the-cuff suggestions.

 - Alex

--
RT Training London, March 19-20 and Dallas May 20-21
http://bestpractical.com/training
Re: [rt-users] rt-serializer and rt-importer
And one last follow-up about this: here are my import statistics. The importer used up to about 800MB of RAM, so better than the serializer, but still pretty high.

Also, 108 hours seems like a long time to import records that took only 4 hours to export, and this is on a pretty high-end MySQL server. Is there any way to speed things up a bit?

Elapsed time: 108hr 36min
Estimated left: 1min 39s

== Import of soe.ucsc.edu ==
Total object counts:
  1002724 RT::Transaction
   665796 RT::Attachment
   257742 RT::Group
   182469 RT::GroupMember
    62315 RT::Ticket
     8996 RT::Attribute
     8316 RT::User
     4324 RT::Link
      988 RT::ObjectCustomFieldValue
      312 RT::CustomFieldValue
       37 RT::Queue
       30 RT::ObjectCustomField
       30 RT::CustomField

--
Tim Gustafson
t...@ucsc.edu
831-459-5354
Baskin Engineering, Room 313A
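To put the import statistics above in perspective, here is a rough back-of-the-envelope calculation in Python. It is not output from RT. It assumes that each count in the summary precedes its class name, and it takes the export time as the approximate "4 hours" quoted in the message.

```python
# Rough throughput estimate from the import statistics quoted above.
# Assumption: each count precedes its class name in the importer summary.
import_counts = {
    "RT::Transaction": 1002724,
    "RT::Attachment": 665796,
    "RT::Group": 257742,
    "RT::GroupMember": 182469,
    "RT::Ticket": 62315,
    "RT::Attribute": 8996,
    "RT::User": 8316,
    "RT::Link": 4324,
    "RT::ObjectCustomFieldValue": 988,
    "RT::CustomFieldValue": 312,
    "RT::Queue": 37,
    "RT::ObjectCustomField": 30,
    "RT::CustomField": 30,
}

elapsed_s = 108 * 3600 + 36 * 60  # "Elapsed time: 108hr 36min"
export_s = 4 * 3600               # "only 4 hours to export" (approximate)

total_objects = sum(import_counts.values())
objects_per_second = total_objects / elapsed_s
slowdown = elapsed_s / export_s

print(f"{total_objects} objects imported in {elapsed_s} seconds")
print(f"about {objects_per_second:.1f} objects per second on import")
print(f"import took about {slowdown:.0f} times as long as export")
```

Under those assumptions the importer averaged only a handful of objects per second, which is the kind of figure a profiler run would help explain.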
Re: [rt-users] rt-serializer and rt-importer
I wanted to follow up with this and let you know that I was able to get rt-serializer to finally complete by moving the installation to a machine with considerably more RAM (12GB instead of 2GB). The serializer process maxed out at about 6.3GB of RAM while serializing an 8GB database. This is using RT 4.2.2 on FreeBSD 9.2 with Perl 5.16.3.

The serializer reported the following statistics when it finished:

Total object counts:
  1002902 RT::Transaction
   665796 RT::Attachment
   257818 RT::Group
   182643 RT::GroupMember
    62315 RT::Ticket
     9027 RT::Attribute
     8382 RT::User
     4324 RT::Link
      988 RT::ObjectCustomFieldValue
      312 RT::CustomFieldValue
       38 RT::Queue
       30 RT::ObjectCustomField
       30 RT::CustomField

I'm going to try running the importer next, but I suspect that uses a lot less RAM.

--
Tim Gustafson
t...@ucsc.edu
831-459-5354
Baskin Engineering, Room 313A
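The serializer counts above can be compared with the importer counts reported in the later follow-up in this thread. The Python sketch below only subtracts the two sets of figures; it assumes each count precedes its class name, and it does not attempt to explain why the numbers differ.

```python
# Per-class difference between the serializer summary (above) and the
# importer summary from the later follow-up message.
# Assumption: each count precedes its class name in both summaries.
serialized = {
    "RT::Transaction": 1002902, "RT::Attachment": 665796,
    "RT::Group": 257818, "RT::GroupMember": 182643,
    "RT::Ticket": 62315, "RT::Attribute": 9027,
    "RT::User": 8382, "RT::Link": 4324,
    "RT::ObjectCustomFieldValue": 988, "RT::CustomFieldValue": 312,
    "RT::Queue": 38, "RT::ObjectCustomField": 30, "RT::CustomField": 30,
}
imported = {
    "RT::Transaction": 1002724, "RT::Attachment": 665796,
    "RT::Group": 257742, "RT::GroupMember": 182469,
    "RT::Ticket": 62315, "RT::Attribute": 8996,
    "RT::User": 8316, "RT::Link": 4324,
    "RT::ObjectCustomFieldValue": 988, "RT::CustomFieldValue": 312,
    "RT::Queue": 37, "RT::ObjectCustomField": 30, "RT::CustomField": 30,
}

# Keep only the classes whose counts differ between the two runs.
differences = {
    cls: serialized[cls] - imported[cls]
    for cls in serialized
    if serialized[cls] != imported[cls]
}

for cls, diff in differences.items():
    print(f"{cls}: {diff} fewer objects in the importer summary")
```

Tickets, attachments and links match exactly in the two summaries; the differences are confined to transactions, groups, group members, attributes, users and queues.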
Re: [rt-users] rt-serializer and rt-importer
On Thu, Feb 20, 2014 at 10:43:02AM -0800, Tim Gustafson wrote:
> 1. I see some options in rt-serializer to skip users and groups. Does
> this mean that any users who are attached to tickets, either as a
> requester, owner, CC or whatever won't be imported during rt-importer?

The docs cover this pretty well:
http://bestpractical.com/docs/rt/latest/rt-serializer.html#no-users

Also available from running rt-serializer --help

> 2. I see an option in rt-importer to save the old ticket organization
> and ID into a custom field. Will this field be used to look up new
> e-mails that come in to the system via SMTP? Or if someone replies to
> an existing ticket e-mail they've received from the original RT
> system, will that create a new ticket in the new system which will
> have to be manually merged with the old ticket?

We don't ship any code to implement that, however several clients who used the importer have written code to do just that. It should be straightforward to write given RT::Interface::Email::ExtractTicketId as a hook point.

-kevin
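The lookup described in the reply above can be sketched as follows. This is a language-neutral illustration written in Python, not RT code: RT itself is Perl, and nothing here uses the RT API or RT::Interface::Email::ExtractTicketId. The function name, the mapping table, the ticket numbers and the new organization name "example.org" are all hypothetical; only the "[organization #id]" subject tag format and the old organization name "soe.ucsc.edu" come from RT and this thread.

```python
import re

# Hypothetical table built after the import from the custom field that
# stores the old organization and ID:
# (old organization, old ticket id) -> new ticket id.
OLD_TO_NEW = {("soe.ucsc.edu", 1234): 98765}

# RT subject tags look like "[organization #id]".
SUBJECT_TAG = re.compile(r"\[(?P<org>[^\]#]+?)\s+#(?P<id>\d+)\]")


def resolve_ticket_id(subject, current_org, old_to_new):
    """Return the ticket id a reply belongs to, or None for a new ticket."""
    match = SUBJECT_TAG.search(subject)
    if match is None:
        return None  # no tag at all: treat the mail as a new ticket
    org = match.group("org").strip()
    ticket_id = int(match.group("id"))
    if org == current_org:
        return ticket_id  # tag already refers to the new system
    # Tag refers to another system: translate it if it was imported.
    return old_to_new.get((org, ticket_id))


print(resolve_ticket_id("Re: [soe.ucsc.edu #1234] Printer", "example.org", OLD_TO_NEW))
print(resolve_ticket_id("Re: [example.org #42] Printer", "example.org", OLD_TO_NEW))
print(resolve_ticket_id("Printer is broken", "example.org", OLD_TO_NEW))
```

A real implementation would query the custom field from inside RT at the hook point named in the reply, rather than consult an in-memory table.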
Re: [rt-users] rt-serializer and rt-importer
> The docs cover this pretty well:
> http://bestpractical.com/docs/rt/latest/rt-serializer.html#no-users

Well, the docs say:

  "By default, all privileged users are serialized; passing --no-users limits it to only those users which are strictly necessary."

But it does not tell you what "strictly necessary" means. The documentation specifically calls out privileged users here, but if I'm not interested in maintaining user privileges from the system that I'm exporting from, then does using this parameter mean that those users will be imported only as ticket members, or will privileged users be re-created in the new system as privileged? Or is this a flag to say "normally, privileged users are copied even if they're not associated with a ticket, but with this flag they're not copied unless they are associated with a ticket"?

So, to put my question another way, what's the proper combination of command-line arguments to copy over *only* queues and tickets, and to *not* copy any groups or ACLs? My best guess is:

  rt-serializer --no-users --no-groups --no-deleted

Also, I was running this command against my production data, which has about 80,000 tickets in it, and it consumed obscene amounts of RAM. The rt-serializer hit about 1.3GB of RAM before the machine ran out of swap, even with a low --gc value and a smaller --page value. I even tried a --gc of -1 and that didn't seem to matter either. The whole database for this installation is a bit shy of 10GB, so I'm wondering if I need to run this on a machine with at least that much RAM?

And oddly, the dump folder only contained two 32MB +/- files in it when the process crashed, which is causing me some concern. Does that imply that it used 1.3GB of RAM to export 64MB of data?

--
Tim Gustafson
t...@ucsc.edu
831-459-5354
Baskin Engineering, Room 313A
Re: [rt-users] rt-serializer and rt-importer
On Fri, Feb 21, 2014 at 09:37:46AM -0800, Tim Gustafson wrote:
> But it does not tell you what strictly necessary means.

Needed to have a complete RT system on the other side.

I'm not sure why you think that an RT with only ticket / queue data is useful. You lose all the requestors and owners and ccs/adminccs associated with tickets, which would tend to make the history kind of useless. I understand that you want to 'rework your users and ACLs' but RT will not just magically pair up your brand new users with the old ticket data.

--no-users will only export users that it encounters while crawling other objects, rather than defaulting to serializing all privileged users and then serializing unprivileged users as needed.

> So, to put my question another way, what's the proper combination of
> command-line arguments to copy over *only* queues and tickets, and to
> *not* copy any groups or ACLs? My best guess is:

I don't believe you can do this.

> Also, I was running this command against my production data, which has
> about 80,000 tickets in it, and it consumed obscene amounts of RAM.

You don't specify an RT version. 4.2.2 and 4.2.3 both contain fixes for memory usage. I've recently serialized with --clone a larger database in less RAM than you quote.

-kevin
Re: [rt-users] rt-serializer and rt-importer
> I'm not sure why you think that an RT with only ticket / queue data is
> useful. You lose all the requestors and owners and ccs/adminccs
> associated with tickets, which would tend to make the history kind of
> useless.

Sorry, I guess I'm not being clear. That's exactly what I'm talking about: I want to copy over requestors, owners, CCs and so on, but not privileged access or group information.

> --no-users will only export users that it encounters while crawling
> other objects, rather than defaulting to serializing all privileged
> users and then serializing unprivileged users as needed.

That was the answer I was looking for, and I think it would be helpful to add it to the documentation.

> You don't specify an RT version. 4.2.2 and 4.2.3 both contain fixes
> for memory usage.

4.2.2

--
Tim Gustafson
t...@ucsc.edu
831-459-5354
Baskin Engineering, Room 313A
[rt-users] rt-serializer and rt-importer
I'm thrilled to see an export/import mechanism for RT now - thanks!

I have three RT instances that I will be merging together using these tools. I have a few questions though:

1. I see some options in rt-serializer to skip users and groups. Does this mean that any users who are attached to tickets, either as a requester, owner, CC or whatever won't be imported during rt-importer? Or will they be created on the fly during rt-importer? Or does this only refer to privileged users and ACL groups? My plan is to re-build the privileged users and ACL groups by hand as we don't have that many of them and the target system uses a different LDAP server via ExternalAuth, so I need to look up the equivalent user names on the new system manually. So, if I don't need to worry about those privileged users and ACL groups, does that mean I can use those options?

2. I see an option in rt-importer to save the old ticket organization and ID into a custom field. Will this field be used to look up new e-mails that come in to the system via SMTP? Or if someone replies to an existing ticket e-mail they've received from the original RT system, will that create a new ticket in the new system which will have to be manually merged with the old ticket?

Is there anything else I should be considering when merging two RT installations together?

--
Tim Gustafson
t...@ucsc.edu
831-459-5354
Baskin Engineering, Room 313A